FOREWORD


The  National  Center  for  Health  Services  Research 
and  the  National  Center  for  Health  Statistics  support 
research  in  survey  methods  in  order  to  increase  the 
validity  and  reliability  of  measures  of  health,  the 
availability  of  health  services,  and  the  use  of  health 
services.  While  much  technical  progress  has  been 
made  in  the  refinement  of  health  survey  methods  and 
measures,  the  dissemination  of  the  state  of  the  art 
to  the  general  health  services  research  community  re- 
mains problematic,  and  there  is  a  need  to  identify 
needs  and  priorities  for  continued  research  activities. 

Recognizing  these  needs,  the  two  Centers  jointly 
sponsored  this  invitational  conference  to  bring  to- 
gether leading  researchers  in  health  survey  meth- 
odology. Participants  were  charged  to  review  the  cur- 
rent state  of  the  art  of  health  survey  research  and 
to  identify  areas  and  issues  for  continuing  research. 
It  is  hoped  that  this  digest  of  conference  proceedings 
will:  (a)  acquaint  health  services  researchers  whose 
primary  skills  are  not  in  survey  methods  with  the 
limitations  and  difficulties  inherent  in  health  surveys, 
and  (b)  apprise  researchers  whose  interests  and  skills 
are  in  the  area  of  health  survey  methodology  of  the
research  needs  and  priorities  identified  by  conference
participants.

This  report  could  not  have  made  its  timely  ap- 
pearance without  the  dedicated  efforts  of  our  internal 
staff  and  consultant  planning  group.  It  is  our  pleasure 
to  acknowledge  the  generous  assistance  of  Sherman 
Williams,  Joseph  de  la  Puente,  William  Lohr,  William
Kitching,  Rita  Delmont,  Linda  McCleary,  Juanita
Locke  and  Annabelle  Ridenour,  National  Center
for  Health  Services  Research;  Robert  Fuchsberg, 
Monroe  Sirken,  Elijah  White  and  Nancy  Pearce, 
National  Center  for  Health  Statistics;  Kirk  Wolter, 
Bureau  of  the  Census;  and  Leo  Reeder,  Seymour  Sud- 
man,  Ronald  Andersen,  Charles  Cannell,  Floyd  Fow- 
ler, Bernard  Greenberg,  and  Daniel  Horvitz,  of  the 
non-Federal  planning  group.  The  major  credit  for 
the  success  of  this  effort  lies  with  the  participants  of 
the  conference,  the  many  skilled  and  dedicated  indi- 
viduals who  have  committed  themselves  to  the  im- 
provement of  health  services  in  the  United  States. 
The  listing  of  conference  participants  in  Appendix 
B  is  token  recognition  of  their  outstanding  contribu- 
tion. 


Dorothy  P.  Rice 
Director,  NCHS 


Gerald  Rosenthal,  Ph.D. 
Director,  NCHSR 


INTRODUCTION 


Brief  Historical  Overview 

Survey  research  has  a  long  and  honorable  tra- 
dition. Its  roots,  especially  in  health,  go  back  to  the 
population  surveys  in  France  in  the  late  18th  Century
and  to  the  Medical  Polizei  in  Germany  (Rosen,
1972).  Later,  in  the  late  19th  Century,  this  method
of  systematically  collecting  data  from  populations  or 
samples  of  populations  through  the  use  of  personal 
interviews  was  elaborated  by  the  British  social  sur- 
veys of  Booth  and  others.  Perhaps  in  no  other  coun- 
try, however,  has  the  survey  method  reached  such  a 
broad  range  of  application  as  in  the  United  States. 

Although  sample  surveys  have  spread  throughout 
the  world,  the  American  experience  is  particularly 
broad  and  versatile.  This  is  due  in  no  small  measure 
to  the  fact  that  opinion  and  attitude  surveys,  pro- 
fessional and  amateur,  are  an  integral  part  of  the 
administrative  structure  of  power  in  both  political 
and  business  life. 

The  systematic  study  of  errors,  bias,  and  other 
problems  associated  with  the  application  of  the  sur- 
vey method  began  in  the  early  1930's  with  the  work 
of  the  U.S.  Department  of  Agriculture  and  the  U.S. 
Bureau  of  the  Census.  These  agencies  gave  impetus
to  the  study  of  statistical  sampling  and  other
measurement  problems.  The  attitude-scaling  techniques
developed  by  Cantril,  Likert,  Stouffer,  and  Lazarsfeld,
as  well  as  Mosteller's  work,  gave  great  impetus  to
the  adoption  of  surveys  and  public  opinion  polling
in  the  thirties  and  early  forties.

Perhaps  the  largest  and,  certainly,  one  of  the 
most  impressive  programs  of  survey  research  was 
conducted  during  World  War  II  by  the  Research 
Branch  of  the  Information  and  Education  Division 
of  the  War  Department.  The  behavioral  scientists 
associated  with  this  organization  carried  out  over  300 
separate  studies  on  army  personnel  covering  a  broad 
array  of  topics.  In  what  has  become  almost  a  classic 
case  of  the  adoption  of  social  policy  based  upon 
survey  research,  the  Research  Branch  studies  of  de- 
mobilization priorities  of  troops  led  to  the  adoption 
of  the  so-called  "point  system"  of  demobilization. 
Other  social  surveys  carried  out  during  this  period  for 


other  Government  agencies  also  are  illustrative  of  the 
application  of  this  technique  for  social  policy  pur- 
poses. 

A  principal  benefit  of  these  surveys  was  their 
strong  contribution  to  the  methodology  of  survey  re- 
search. These  studies  addressed  themselves  to  prob- 
lems of  sampling,  questionnaire  construction,  and 
interviewing.  Several  volumes  and  a  large  number  of 
research  papers  were  subsequently  published  that 
have  had  a  significant  impact  upon  the  use  of  sample 
surveys  by  not  only  Government  but  also  business,
industry,  the  mass  media,  and  a  wide  variety  of  agen- 
cies in  the  health  and  welfare  fields. 

The  Use  of  Surveys  in  Health  Research 

In  the  field  of  health,  the  survey  method  has 
become  a  major  tool  for  the  systematic  collection  of 
health-related  data.  It  is  used  by  epidemiologists, 
statisticians,  medical  care  and  health  services  re- 
searchers, medical  sociologists,  health  economists, 
psychologists,  and  of  course,  various  Government 
agencies.  In  the  mid-thirties,  the  United  States  Public 
Health  Service  undertook  what  was  perhaps  the  first 
Government-sponsored  survey  of  the  Nation's  health. 
But  it  was  not  until  the  establishment  of  the  National 
Center  for  Health  Statistics  that  systematic,  periodic 
health  interview  surveys  were  undertaken.  This  agency 
immediately  addressed  itself  to  the  methodological 
problems  of  reliability  of  interview  responses,  validity, 
and  other  nonsampling  measurement  issues  involved 
in  the  Health  Interview  Study.  To  this  day,  NCHS 
is  vitally  concerned  with  the  methodological  im- 
provement of  its  surveys  in  order  to  improve  the 
quality  of  the  data  obtained  from  respondents. 

In  addition,  the  National  Center  for  Health  Serv- 
ices Research  has  supported  a  considerable  number  of 
extramural  research  projects  aimed  at  improving  the 
quality  of  health  surveys.  Much  of  this  activity  occurs 
in  the  course  of  substantively  oriented  research  projects,
such  as  the  work  of  the  Human  Population
Laboratory  at  the  California  State  Department  of 
Public  Health;  the  Washington  Heights  Master  Sam- 
ple Survey  of  Columbia  University;  the  Los  Angeles 
Metropolitan  Area  Survey  at  UCLA;  the  research  of 


the  Center  for  Health  Administration  Studies  at  the 
University  of  Chicago;  the  Survey  Research  Center 
of  the  University  of  Michigan,  and  others.  Finally, 
the  National  Institutes  of  Health,  the  Social  Security 
Administration,  National  Science  Foundation,  and 
others  have  contributed  in  recent  years  to  the  body 
of  knowledge  concerning  survey  methodology  es- 
pecially as  related  to  health.  Indeed,  it  is  difficult  to 
separate  health  and  non-health  survey  methodology; 
survey  methods  developed  by  statisticians  or  sociol- 
ogists have  direct  relevance  for  health  surveys  and 
likewise,  epidemiological  survey  methods  are  often 
of  critical  interest  to  social  scientists. 

Despite  the  considerable  advance  in  survey  meth- 
odology, it  should  be  noted  that  systematic  studies 
of  the  methods  of  this  research  tool  are  of  recent 
origin.  There  is  much  to  be  done  and  the  profes- 
sional researchers  of  the  many  disciplines  that  use 
this  method  are  most  sensitive  to  the  limitations  of 
the  method.  Much  has  been  written  on  the  methods 
and  procedures  of  survey  research;  there  is  also  a 
considerable  body  of  literature  on  the  problems  of 
survey  research  in  the  field  of  health  and  health  serv- 
ices. The  material  discussed  in  the  present  volume  is 
not  intended  as  introductory  material;  rather  it  is 
hoped  to  add  to  the  existing  basic  knowledge  in  the 
field. 

Planning  for  This  Conference 

The  present  conference,  sponsored  by  the  Na- 
tional Center  for  Health  Services  Research  and  the 
National  Center  for  Health  Statistics,  developed  out 
of  a  series  of  symposia  and  workshops  that  these  two 
units  held  during  the  past  two  to  three  years.  Among 
other  things,  these  small  meetings  addressed  them- 
selves to  specific  topics  such  as:  the  use  of  diaries  as 
a  memory  aid  in  retrieving  data  from  respondents; 
scales  to  measure  the  dimensions  of  patient  satisfac- 
tion; sample  designs  and  data-collection  strategies, 
and  so  on.  In  discussions  between  several  participants 
of  these  meetings  and  staff  of  the  National  Center  for 
Health  Services  Research,  the  need  for  a  national  in- 
vitational conference  was  proposed  as  a  way  to  syn- 
thesize the  state  of  the  art  with  respect  to  certain  key 
methodological  concerns  and  to  identify  needs  and 
priorities  for  additional  research. 

Subsequently,  a  planning  committee  was  ap- 
pointed, and  this  committee  determined  that  the  most 
useful  format  for  a  national  conference  would  be  a 


relatively  small  number  of  invited  participants  utiliz- 
ing a  semi-structured  program.  Thus,  no  papers  were 
to  be  prepared  for  this  meeting. 

Rather,  the  planning  committee  prepared  an 
agenda  of  four  major  topics  that  included  a  number 
of  salient  issues  under  each  topic.  A  planning  com- 
mittee member  served  as  the  chairman  for  one  topic 
on  the  agenda;  each  chairman  invited  a  specific  in- 
dividual to  serve  as  the  rapporteur  or  recorder  for 
his  session.  Each  of  the  major  topics  was  given  ap- 
proximately one-half  day  for  open  or  free  discussion. 
The  objectives  of  this  conference  were: 

1.  To  identify  the  critical  methodological  issues 
or  problem  areas  for  health  survey  research 
and  the  state  of  the  art  or  knowledge  with 
respect  to  these  problems. 

2.  To  identify  the  types  of  research  problems  that
need  to  be  given  high  priority  for  research  funding.

3.  To  identify  policy  issues  that  can  be  addressed 
by  survey  research  scientists. 

4.  To  communicate  the  results,  recommenda- 
tions, and  implications  of  this  conference  to: 

(a)  the   broader  community   of  health  re- 
searchers who  use  survey  methods; 

(b)  relevant   Government   agencies   and  in- 
dividuals; 

(c)  other  potential  users  of  these  results  of 
this  conference. 

We  hope  that  this  report  of  the  conference  will 
be  found  useful  by  those  who  read  it.  No  attempt  has 
been  made  to  present  a  verbatim  transcript;  rather, 
the  chairman  and  recorder  of  each  session  have  pre- 
pared a  report  that  presents  the  various  issues  that 
were  discussed,  the  comments  made  about  them  by 
various  participants,  and  a  summary  and  recommen- 
dations. 

This  report  is  tentatively  planned  as  Volume  1 
of  a  series  of  such  conference  proceedings  on  advances 
in  health  survey  research  methods.  It  is  hoped  that 
conferences  and  reports  such  as  this  will  occur  on  a 
biennial  or  triennial  basis. 
Introduction

Attempting  to  summarize  a  technical  working 
conference  is  always  a  difficult  task;  it  is  doubly  so 
in  the  present  instance  because  the  content  of  the 
conference  was  so  rich  in  substantive  contributions. 
Conference  participants  are  generally  not  noted  for 
their  post-conference  enthusiasm;  the  present  meet- 
ing was  a  rare  exception  in  that  several  participants 
took  the  time  to  write  letters  not  only  to  the  Con- 
ference Planning  Committee  but  also  to  high-level 
government  officials  favorably  commenting  on  the 
format  and  outcome  of  the  conference.  In  the  final 
analysis,  however,  the  usefulness  of  any  conference  is 
determined  by  its  products,  only  a  portion  of  which 
is  visible  in  the  form  of  a  proceedings  such  as  this 
volume. 

The  conference  was  organized  to  discuss  the  state 
of  the  art  of  knowledge  in  certain  critical  methodolo- 
gic  areas  of  health  survey  research  and  to  identify 
problems  for  high  priority  research  on  non-sampling 
measurement  errors.  Four  main  topics  were  selected 
for  review:  (1)  research  instruments;  (2)  interview- 
ing; (3)  problems  of  validity;  (4)  total  survey  design. 
Critical  methodological  issues  in  each  of  these  areas 
were  isolated  and  discussed  with  reference  to  current 
state  of  knowledge  and  the  needs  for  further  research. 

In  many  respects,  health  surveys  have  advantages 
over  surveys  in  other  areas.  As  the  conference  partici- 
pants noted,  health  surveys  are  usually  more  complex 
and  are  longer  than  other  surveys.  Despite  the  com- 
plexity and  length,  respondents  participate  in  sub- 
stantially high  proportions,  and  frequently  enjoy  the 
opportunity  to  respond  to  questions  about  their 
health.  The  importance  of  this  commitment  and  will- 
ingness to  participate  in  health  surveys  should  not  be 
understated  because  such  a  commitment  provides  an 
unusual  environment  and  opportunity  for  health  in- 
vestigations and  simultaneously  places  a  serious  ob- 
ligation on  the  shoulders  of  investigators  provided 
with  the  time  and  trust  of  the  respondent. 

In  order  to  fulfill  our  responsibilities  to  our 
scientific  colleagues  and  to  participating  respondents, 
the  conference  gave  special  emphasis  to  the  concept  of 
total  survey  design,  which  attempts  to  provide  a
framework  for  the  assessment  of  survey  errors  and
their  cost  components.

Total  survey  design  is  a  concept  that  facilitates 
the  planned  allocation  of  resources  towards  the  op- 
timal reduction  of  the  total  error  of  estimate.  The 
urgent  need  for  survey  results  is  dramatically  por- 
trayed by  the  number  of  dollars  being  invested  to  con- 
duct the  surveys,  and  the  policy  decisions  being  made 
as  a  consequence  of  the  availability  or  lack  of  avail- 
ability of  results.  Similar  investments  in  the  develop- 
ment of  more  efficient  surveys  are  also  dramatic  in 
their  paucity.  The  fact  is  that  each  component  of
error  (sampling,  response,  interviewers,  and  their
interactions)  is  manageable  at  a  cost  and  does  affect  the
results  of  every  survey.  While  studies  addressing  a  single
component  of  error  are  important,  consideration  of
all  components  of  error,  as  an  integral  part  of  such
studies,  is  essential.
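
As  an  illustration  only,  the  allocation  problem  described  above  can  be  sketched  numerically.  The  sketch  assumes  that  total  error  is  summarized  as  root  mean  squared  error  (sampling  variance  under  simple  random  sampling  plus  squared  nonsampling  bias).  The  Design  fields,  the  cost  figures,  and  the  bias  values  are  hypothetical  and  do  not  come  from  the  conference.

```python
import math
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    n: int                # completed interviews
    cost_per_case: float  # hypothetical field cost per interview
    fixed_cost: float     # hypothetical fixed cost, e.g. interviewer training
    bias: float           # assumed net nonsampling bias of the estimate


def total_cost(d: Design) -> float:
    """Fixed cost plus per-interview field cost."""
    return d.fixed_cost + d.n * d.cost_per_case


def rmse(d: Design, unit_sd: float) -> float:
    """Root mean squared error: sqrt(sampling variance + bias squared)."""
    sampling_var = unit_sd ** 2 / d.n  # simple random sampling assumed
    return math.sqrt(sampling_var + d.bias ** 2)


def best_design(designs, budget: float, unit_sd: float):
    """Return the affordable design with the smallest RMSE, or None."""
    affordable = [d for d in designs if total_cost(d) <= budget]
    if not affordable:
        return None
    return min(affordable, key=lambda d: rmse(d, unit_sd))


# Two hypothetical designs that cost the same amount.
large_sample = Design("large sample", n=2000, cost_per_case=25.0,
                      fixed_cost=0.0, bias=0.05)
more_training = Design("more training", n=1200, cost_per_case=25.0,
                       fixed_cost=20000.0, bias=0.01)

choice = best_design([large_sample, more_training],
                     budget=50000.0, unit_sd=0.5)
print(choice.name)
```

Under  these  assumed  figures,  the  design  that  diverts  part  of  the  budget  from  sample  size  to  interviewer  training  has  the  smaller  total  error,  even  though  its  sampling  error  is  larger.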

Throughout  the  conference,  there  was  consensus 
that  while  the  results  of  survey  research  document 
the  methods  for  controlling  and  monitoring  various 
specific  sources  of  errors,  the  continued  development 
of  an  information  matrix  depicting  not  only  methods 
but  costs  should  be  actively  sponsored. 

Knowledge  and  application  of  survey  methodo- 
logy is  an  indispensable  ingredient  in  the  orderly 
growth  of  health  services  research.  Most  of  the  useful 
products  of  research  depend  on  the  applicability,  va- 
lidity, and  reproducibility  of  survey  results.  Indeed, 
policy  decisions  are  being  made  today  on  the  basis  of 
evidence  obtained  by  conducting  national,  area,  and 
local  surveys.  Paradoxically,  those  responsible  for  de- 
signing, conducting,  analyzing,  and  providing  the  re- 
sults of  surveys,  seldom  have  all  the  specialized  ex- 
pertise vital  to  the  successful  design,  conduct  and  re- 
porting of  the  surveys  being  conducted. 

The  questions  of  validity  and  reliability  often 
focus  upon  sampling  errors  as  the  major  measurement 
issue,  whereas  non-sampling  measurement  errors  are 
equally  important.  While  many  journal  articles  re- 
port sampling  errors,  very  little  attention  is  given  to 
the  reporting  of  non-sampling  errors.  This  conference 
focused  particularly  on  the  importance  of  addressing 
non-sampling  measurement  errors. 


Surveys  are  conducted  through  a  variety  of  me- 
dia: mail,  telephone,  and  face-to-face  interviews  as 
well  as  through  several  variants  and  combinations  of 
these.  The  conference  did  not  attempt  to  address  all 
of  these  forms  of  surveys  in  a  systematic  fashion,  how- 
ever, since  face-to-face  and  telephone  interviews  are 
the  most  frequent  types  of  survey  used  in  health  studies.
Mail  surveys  are  used  more  often  for  a  limited
initial  contact  or  as  a  follow-up  to  a  personal  interview.

This  report  is  being  developed  to  serve  two 
groups  of  health  services  research  specialists:  those 
engaged  in  methodological  research  in  health  or  other 
areas  and  those  investigators  who  are  the  users  of 
survey  research  methodology  in  their  substantive 
work. 

Obtaining  valid  and  reliable  data  from  respon- 
dents, i.e.,  various  kinds  of  publics  of  interest  to 
health  researchers,  has  always  been  a  matter  of  con- 
cern to  research  investigators.  While  considerable 
progress  has  been  made  in  the  development  of  better 
procedures  to  elicit  valid  and  reliable  responses  from 
our  various  study  populations,  certain  problems  re- 
main to  be  solved.  This  conference  afforded  a  rare 
opportunity  for  an  exchange  of  knowledge  concerning 
a  variety  of  issues  including:  use  of  proxy  respondents 
in  obtaining  data;  procedures  to  aid  recall  (such  as 
use  of  diaries  and  memory  aids);  questionnaire
length;  use  of  the  telephone  and  so  on.  Basically,  the 
conferees  were  concerned  with  improvements  in  the 
quality  of  the  data  obtained  from  survey  respondents 
through  the  use  of  such  data-collection  instruments 
and  several  recommendations  were  made  of  problem 
areas  where  methodological  research  should  be  given 
priority. 

There  is  another  fundamental  aspect  to  data-col- 
lection in  survey  research,  namely,  the  quality  of  the 
interviewing  process  itself  and  the  relationship  be- 
tween interviewers  and  respondents.  It  became  clear 
that  telephone  interviewing  was  a  feasible  alternative 
to  face-to-face  interviewing,  but  questions  remain  concerning
sample  representativeness,  interviewer  characteristics,
and  the  quality  of  the  data  obtained.
Clearly,  this  procedure  requires  additional  research 
and  merits  high  priority  in  methodological  studies  to 
improve  health  surveys.  In  addition  to  reaching  con- 
sensus on  the  usefulness  of  the  telephone  in  inter- 
viewing, the  conference  participants  also  agreed  that 
racial  matching  of  interviewers  and  respondents  was 
unimportant  when  non-racial  issues  were  the  subject 
of  the  investigation.  But  the  problem  of  relative  status 
differentials  between  interviewers  and  respondents 
merits  further  study.  One  of  the  least  understood  facts 
in  survey  research  is  that  the  quality  of  interviewing 
can  affect  the  data  as  much  as  or  more  than  response
rates,  sample  design,  and  so  on.  The  conference  in- 
dicated that  too  little  is  known  about  the  role  of  in- 
terviewer behavior  on  interview  results.  There  is  vir- 
tually no  systematic  body  of  research  data  on  the 
evaluation  of  differential  interviewer  training  strate- 


gies so  that  appropriate  guidelines  for  better  metho- 
dology can  be  established. 

Governmental  agencies  supporting  studies  using 
the  survey  method  might  give  serious  consideration 
to  the  development  of  a  set  of  guidelines  to  be  used 
by  research  grant  applicants  or  contractors.  Such 
guidelines  can  assist  applicants  to  adhere  to  estab- 
lished "good  practices"  in  survey  methods  and  proce- 
dures. 

A  recurrent  problem  in  survey  research,  particu- 
larly studies  concerned  with  the  collection  of  sensitive 
or  confidential  data,  is  the  issue  of  validity.  How  can 
bias  caused  by  deliberate  or  unintentional  untruthful 
reporting  be  reduced  or  perhaps  eliminated?  Several 
procedures  were  discussed  at  this  conference  that 
suggest  reasonable  avenues  for  accomplishing  such  a 
goal.  Such  procedures  as  randomized  response,  coding 
systems,  multiple  respondent  or  network  surveys,  and 
so  on,  were  considered  and  evaluated.  Clearly,  these 
procedures  have  much  utility  in  survey  research  and 
merit  wider  applicability.  Nevertheless,  considerable 
additional  research  appears  to  be  required.  For  ex- 
ample, we  know  very  little  about  the  acceptability  of 
these  alternative  procedures  to  the  respondent;  more- 
over, our  knowledge  is  scanty  with  respect  to  the  util- 
ity of  these  newer  procedures  in  other  than  face-to- 
face  interviews  such  as  mail  questionnaires  and  tele- 
phone interviews.  The  conference  participants  were 
also  concerned  about  the  effects  of  recent  legislation 
aimed  at  protecting  the  privacy  of  individuals  on  the 
legitimate  research  and  validation  procedures  which 
have  been  used  by  statisticians  and  others  for  decades. 
Certain  pitfalls  in  studies  that  use  record  linkage  as
a  means  of  checking  validity  were  also  discussed.
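
To  make  the  randomized  response  idea  concrete,  the  following  sketch  shows  the  estimator  for  one  well-known  form  of  the  technique,  Warner's  design.  Each  respondent  privately  answers  either  the  sensitive  statement  (with  probability  p)  or  its  negation  (with  probability  1 - p),  so  the  interviewer  never  learns  which  statement  a  given  "yes"  refers  to.  The  source  does  not  say  which  variant  the  conferees  discussed;  the  function  names  and  the  figures  in  the  example  are  illustrative  assumptions.

```python
def warner_estimate(yes_count: int, n: int, p: float) -> float:
    """Estimate the proportion holding the sensitive attribute.

    Under Warner's design, P(yes) = p * pi + (1 - p) * (1 - pi),
    which is solved here for pi.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if p == 0.5:
        raise ValueError("p = 0.5 carries no information about pi")
    observed_yes = yes_count / n
    return (observed_yes + p - 1) / (2 * p - 1)


def warner_variance(yes_count: int, n: int, p: float) -> float:
    """Estimated variance of the Warner estimate (binomial approximation)."""
    observed_yes = yes_count / n
    return observed_yes * (1 - observed_yes) / (n * (2 * p - 1) ** 2)


# Hypothetical example: 380 "yes" answers from 1,000 respondents, p = 0.7.
print(round(warner_estimate(380, 1000, 0.7), 3))
```

The  price  of  the  added  privacy  is  visible  in  the  variance  formula:  the  closer  p  is  to  one-half,  the  larger  the  variance  for  a  given  number  of  respondents.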

A  major  issue  in  survey  research  concerns  the 
matter  of  costs.  Although  professional  survey  re- 
searchers are  aware  of  this  issue,  it  tends  to  be  con- 
sidered independent  of  other  variables  in  the  design 
and  conduct  of  methodological  studies.  The  con- 
ference highlighted  a  concept,  Total  Survey  Design 
(TSD),  that  can  be  operationalized  and  used  effec- 
tively by  investigators  to  measure  the  cost  components 
of  given  measurement  designs.  An  information  matrix 
was  suggested  for  determining  survey  error  and  cost 
components.  This  session,  in  particular,  generated 
considerable  discussion  concerning  the  relative  lack 
of  adequate  funds  committed  toward  sophisticated  re- 
search on  various  components  of  health  survey  re- 
search methods. 

A  major  concern  elicited  in  the  TSD  discussion 
was  the  widespread  variability  in  terminology  and 
definitions  of  major  methodological  concepts.  As  a 
step  toward  clarification  of  this  issue  and  to  promote 
standardization,  the  Conference  Planning  Group  for- 
mulated definitions  for  concepts  used  in  each  session; 
thus,  a  glossary  has  been  provided  in  this  volume.
Obtaining  consensus  on  the  definition  of  such  concepts
is  central  to  advancing  the  TSD  strategy  and
to  providing  the  data  needed  to  improve  survey  work
generally.  It  might  be  noted,  parenthetically,  that  this
effort  follows  the  lead  of  the  Social  Science  Research 
Council's  Center  for  the  Coordination  of  Social  In- 
dicators' recent  report  on  the  standardization  of  cer- 
tain common  personal  background  data  typically  col- 
lected in  surveys. 

Clearly,  this  conference  did  not  solve  or  even 
highlight  all  of  the  problems  of  survey  research.  It  did 
provide  a  forum  in  which  survey  methodologists  and 
professional  users  of  the  method  in  health  research 
could  exchange  ideas,  agree  on  certain  problematic
issues,  and  suggest  new  lines  of  inquiry.  Because
the  publication  lag  is  too  long  to  keep  the
community  of  researchers  informed  of  new  developments  and
methodological  findings,  conferences  such  as  this  one
are  an  important  means  of  scientific  communication.

Finally,  much  of  the  content  of  this  conference 
has  sometimes  been  perceived  by  mission  agencies  in 
the  Federal  Government  as  "basic"  or  "fundamental" 
and  thus  of  low  priority  for  them  to  encourage  re- 
search or  award  grants  for  methodological  work.  But, 
as  the  conference  forcefully  demonstrates,  such  work 
has  a  special  interest  for  health  services.  Although 
other  agencies,  such  as  the  National  Science  Founda- 
tion, have  a  mission  to  sponsor  investigations  that  are 
contributory  to  the  body  of  knowledge  concerning  the 
substantive  and  statistical  bases  of  the  surveys,  mis- 
sion-oriented Federal  agencies  have  an  obligation  also 
to  undertake  such  research  programs.  If  health  poli- 
cies and  programs  are  to  rest  upon  a  sound  data  base, 
the  mission-oriented  agencies  have  an  obligation  to 
support  methodological  survey  research.  (See  Support
of  Basic  Research  by  Mission  Agencies,  National
Science  Foundation,  National  Science  Board,
NSB-74-225,  October  23,  1974.)

The  following  issues  merit  special  consideration: 

A.  There  are  several  trade-offs  to  be  considered  in 
determining  the  length  of  the  recall  period  to  be 
used  in  a  survey.  Some  of  the  considerations  are 
often  epidemiological  in  nature.  If  the  attribute 
to  be  recorded  happens  to  be  an  unusual  event, 
one  would  need  a  rather  long  period  of  recall  to 
obtain  a  robust  numerator.  On  the  other  hand, 
as  one  increases  the  length  of  the  recall  period, 
telescoping  and  other  sources  of  error  may  plague 
the  survey. 

B.  There  was  general  agreement  that  certain  interviews
can  easily  last  from  one  to  two  hours  without
serious  effects  on  the  respondent.  The  major
problem  in  this  area  appears  to  be  that  of  interviewer
fatigue.

C.  Advantages  and  disadvantages  of  telephone  inter- 
views were  discussed.  The  advantages  presented 
had  a  decided  edge  over  disadvantages.  The 
major  disadvantages  encompassed  sample  prob- 
lems and  frame  development.  Various  strategies 
designed  to  reduce  errors  due  to  these  sampling 
problems  were  presented.  It  was  the  consensus 
that,  with  the  proper  strategy,  telephone  inter- 
views can  be  efficient.  There  are  many  advantages 
of  being  able  to  use  telephone  interviews  in 
health  surveys.  Among  the  advantages  discussed 
were  the  following: 

1.  One  can  assess  interviewer  performance
through  monitoring.

2.  One  can  interview  in  areas  where  interviewers 
would  hesitate  to  go,  particularly  during  the 
evening  when  one  may  want  to  reach  working 
members  of  the  household. 

3.  Health  professionals  are  more  likely  to  par- 
ticipate in  telephone  surveys. 

4.  Interviewer  restrictions  (mobility,  transporta- 
tion)  are  substantially  lessened. 

5.  Field  costs  are  reduced. 

D.  Trade-offs  between  additional  training  of  inter- 
viewers and  compensation  of  respondents  were  dis- 
cussed. The  participants  agreed  that  there  are 
greater  gains  in  the  quality  of  data  obtained  by 
additional  training  of  interviewers  than  by  com- 
pensating the  respondents. 

E.  The  position  was  taken  that  unless  the  survey 
addresses  racial  issues,  there  is  very  little  evidence 
that  the  race  of  the  interviewer  has  any  effect  on 
the  response. 

F.  It  was  agreed  that  there  is  no  simple  or  direct 
method  for  measuring  response  bias  and  that  this 
is  a  very  critical  issue.  One  method  of  addressing 
this  problem  is  that  of  assessing  the  internal  con- 
sistency of  records.  To  this  effect,  the  complexity 
of  record  linkage  and  methods  for  deciding  which 
is  the  valid  record  were  discussed.  The  principal 
use  of  the  information  to  be  obtained  should  de- 
termine the  priorities  existing  towards  reducing 
the  range  of  errors  that  have  to  be  considered  in 
observing  the  principle  of  Total  Survey  Design. 
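
The  recall-period  trade-off  in  item  A  above  can  be  sketched  with  a  simple  model.  The  sketch  assumes  that  events  occur  at  a  constant  monthly  rate  and  that  the  chance  of  an  event  being  reported  decays  exponentially  with  the  time  since  it  occurred.  Both  assumptions,  the  decay  parameter,  and  the  example  figures  are  hypothetical;  telescoping  (misdating  of  events)  is  not  modeled,  only  omission.

```python
import math


def expected_events(n: int, monthly_rate: float, months: float) -> float:
    """Expected true number of events among n respondents in the period."""
    return n * monthly_rate * months


def reported_fraction(months: float, decay: float) -> float:
    """Average share of true events reported over the recall period.

    Assumes an event that occurred t months ago is reported with
    probability exp(-decay * t); the average over 0..months is
    (1 - exp(-decay * months)) / (decay * months).
    """
    if decay == 0:
        return 1.0
    return (1 - math.exp(-decay * months)) / (decay * months)


# Hypothetical rare event: 1 per 100 persons per month, 1,000 respondents.
for months in (1, 3, 6, 12):
    true_count = expected_events(1000, 0.01, months)
    share = reported_fraction(months, decay=0.1)
    print(months, round(true_count), round(share, 2))
```

A  longer  period  yields  more  events  for  the  numerator  but  a  smaller  share  of  them  reported,  which  is  the  tension  the  conferees  described.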


POLICY  ISSUES  AND  COMMUNICATION  OF  RESULTS


Dr.  Leo  G.  Reeder,  Chairman  of  the  Conference, 
introduced  Dr.  Gerald  Rosenthal,  Dr.  Bernard  Green- 
berg,  and  Dr.  Daniel  Horvitz  as  a  panel  to  lead  the 
discussion  of  policy  issues  and  communication  of  re- 
search results.  Dr.  Reeder  emphasized  the  importance 
of  providing  the  results  of  methodological  research 
in  health  surveys  to  the  larger  body  of  researchers  in 
health  services  research  and  the  need  for  an  explicit 
recognition  of  the  requirements  of  support  of  meth- 
odological research. 

Dr.  Rosenthal,  Director  of  the  National  Center
for  Health  Services  Research,  led  the  initial  discussion.
He  indicated  that  the  importance  of  methodological 
research  in  health  surveys  is  fundamental  to  the  via- 
bility of  the  research  for  several  reasons:  (1)  A  sig- 
nificant proportion  of  analytic  work  in  health  services 
research  is  based  on  survey  data;  (2)  The  quality  of 
initial  requests  for  research  support  is  diminished  by 
inadequacies  in  design  and  inappropriate  specification 
of  data  pertinent  to  the  research  issues;  (3)  The  analyses
of  data  developed  by  surveys  are  often  deficient
because  we  cannot  correct  for  errors  in  measurement;
(4)  The  evaluation  of  the  demonstration  efforts  of
the  National  Center  requires  baseline  and  follow-up
surveys  (to  obtain  the  data  for  evaluation  purposes) 
and  the  responsible  persons  may  not  be  sufficiently 
acquainted  with  surveys  to  conduct  a  proper  evalua- 
tion; (5)  There  is  a  need  for  improved  health  sur- 
veys in  terms  of  the  time  frame  of  the  research  in  or- 
der to  avoid  delays  in  the  completion  of  the  studies. 
Investment  in  the  overall  design  of  the  survey  could 
result  in  significant  savings  and  improved  quality  of 
the  data  being  obtained. 

It  is  estimated  that  55  to  65  percent  of  the  re- 
search supported  by  the  National  Center  is  based 
upon  obtaining  data  through  survey  research.  Yet,  in 
terms  of  the  methodological  research  to  improve  the 
quality  of  the  data  being  obtained,  we  are  probably 
not  making  a  sufficient  investment.  We  frequently  use 
very refined statistical techniques, with all their own
assumptions and limitations, on survey data without
concern for correcting measurement errors. It is a
pleasure  to  learn  of  the  work  of  investigators  in  the 
methodological  aspects  of  survey  work  and  the  ad- 
vances being  made  in  improving  the  quality  of  data. 


The  communication  and  use  of  research  findings 
is,  in  essence,  the  major  reason  for  the  existence  of 
the  National  Center.  It  is  a  function  that  is  recognized 
as critical. The dissemination of methodological
research should be a communication problem a little
more amenable to solution than the dissemination of
research findings destined for the world of
health policy and decision-makers. The target audience
is much more clearly identified, and it is a group with
which  we  relate  actively  on  a  repetitive  basis.  As  a  re- 
sult, the  strategies  for  communication  give  more  prom- 
ise of  success.  The  National  Center  is  actively  in- 
terested in  exploiting  the  value  of  methodological 
research  to  improve  the  overall  quality  of  health  serv- 
ices research.  It  is  envisioned  that  the  brokering  of 
methodological  research  results  will  be  pursued  in  a 
variety  of  ways  including:  direct  technical  consulta- 
tion, publications,  and  specific  targeted  training.  The 
National  Center  for  Health  Services  Research  is  the 
obvious  place  for  the  research  community  to  look  for 
help  in  survey  research  in  health  services,  and  we 
ought  to  be  available  and  visible.  The  questions  raised 
by  the  conference  are  important  ones  and  the  results 
of  the  deliberations  should  themselves  be  widely  dis- 
seminated. 

Further  discussion  of  the  issues  by  the  conference 
participants  resulted  in  elaboration  of  both  policy 
issues  and  important  concerns  regarding  the  dissemi- 
nation of  research  results.  It  was  noted  that  methodo- 
logical research  conducted  to  date  has  produced  a  body 
of  knowledge  that  should  be  more  widely  applied  in 
health  services  research.  There  are  also  many  signifi- 
cant questions  and  issues  to  be  addressed  by  meth- 
odological research. 

Another  way  of  improving  the  quality  of  data  is 
to  promote  secondary  analysis  of  the  data  by  investi- 
gators other  than  those  who  originally  collected  the 
data.  This  may  be  especially  important  in  large-scale 
national  surveys  and  in  evaluations  of  important  so- 
cial programs  initiated  through  demonstrations  or 
experiments.  It  was  pointed  out  that  we  frequently 
seem  unwilling  to  invest  a  few  extra  thousand  dol- 
lars in  secondary  analyses  when  we  have  invested 
many  more  dollars  and  several  years  to  acquire  the  in- 
formation. 


In  the  remainder  of  this  chapter,  we  briefly  out- 
line the  policy  issues  that  emerged  in  the  Conference. 
It  is  generally  recognized  that  a  substantial  invest- 
ment of  federal  funds  is  devoted  to  the  collection  of 
data  through  the  use  of  the  survey  method.  The  uses 
to which the data are put may or may not result
in policy decisions, but it is fair to state that,
ideally,  this  would  be  the  case.  As  noted  by  Dr.  Ro- 
senthal, there  is  an  insufficient  investment  in  meth- 
odological research  to  improve  the  quality  of  the  data 
obtained  through  surveys.  The  Conference  itself  will 
no  doubt  have  some  impact  upon  this  situation  by 
calling  attention  to  the  need  for  increased  attention 
to  methodological  issues.  Thus,  we  can  summarize 
the  policy  issues  as  follows: 

1.  There  is  a  need  for  increased  investment  in  the 
methodological  aspects  of  health  surveys,  par- 
ticularly non-sampling  problems. 

2.  Communication  of  research  findings  to  the  broad 
community  of  users  of  health  data  is  essential. 
Government  agencies  such  as  the  National  Cen- 
ter for  Health  Services  Research  may  act  as 
brokers  in  disseminating  research  results  of  meth- 
odological studies  through  a  variety  of  mecha- 
nisms. 

3.  There  is  a  need  for  greater  collaboration  between 
federal  agencies  such  as  the  NCHSR,  NCHS,  and 
NIH  to  advance  the  quality  of  data  collected 
through  surveys.  Both  the  mission  agencies  of  the 
federal  government  and  the  agencies  concerned 
with  so-called  "basic"  research  make  substantial 
investments  in  substantive  investigations  using 
survey  research  methods  of  data  collection.  In- 
vestigators use  this  method  to  make  contributions 
to  the  body  of  knowledge  regarding  scores  of  sub- 
stantive issues  of  direct  concern  to  such  agencies 
as  the  National  Institutes  of  Mental  Health; 
Child  Health  and  Human  Development;  Can- 
cer; Heart  and  Lung;  as  well  as  others  in  NIH. 
Similarly,  the  mission  agencies  such  as  NCHSR 
and  NCHS  also  utilize  the  survey  method  for 
substantive  purposes.  Ultimately,  the  generation 
of  this  knowledge  is  intended  to  improve  the  na- 
tion's health  and  hence,  the  quality  of  the  data 
base  is  of  critical  importance.  Clearly,  with  such 
common  goals,  these  health  agencies  have  an 
obligation  to  support  methodological  studies 
aimed  at  the  improvement  of  the  state  of  the  art. 

4.  Government  agencies  supporting  studies  using 
the  survey  method  might  give  serious  considera- 
tion to  the  development  of  a  set  of  guidelines  to 
be  used  by  research  grant  applicants  or  contrac- 
tors. Such  guidelines  can  assist  applicants  to  ad- 
here to  established  "good  practices"  in  survey 
methods and procedures. Thus, new techniques,
as well as standard approaches in instrument
construction as they apply to health surveys, might
be  distributed  in  some  systematic  fashion  to  those 
applying for grants and contracts involving health
survey work. Moreover, such materials might also
be  made  available  to  project  officers  in  the  grant- 
ing agencies  (NCHSR,  NCHS,  NIH,  NSF,  and 
so on). Perhaps a clearinghouse could be established,
similar to the NCHS Clearinghouse on
Health  Status  Indices.  Obviously,  the  more  "ex- 
perienced" investigator  would  benefit  from  such 
a  project,  as  would  the  novice. 

5.  Although  the  development  of  a  guide  of  stand- 
ards would  be  useful  to  the  less-experienced  sur- 
vey researcher,  it  must  be  recognized  that  survey 
research  is  a  complex  research  methodology  which 
is  deceptively  simple  to  the  uninitiated.  In  terms 
of cost effectiveness, protection of respondents,
and the quality of data, surveys should be
undertaken  only  by  those  who  are  skilled  in  the 
methods  or  who  have  expert  consultants  readily 
available. 

6.  In  order  to  assess  research  results  and  to  advance 
methodological  procedures,  survey  investigations 
should  contain  a  statement  of  procedures  used  in- 
cluding: interviewer  training  and  supervisory 
procedures,  other  quality  control  methods,  and 
response  rates.  The  development  of  standardized 
guidelines  for  monitoring  and  reporting  the  qua- 
lity of  the  data  collection  would  be  particularly 
useful. 

7.  Explicit  attention  to  the  issue  of  validity  should 
be  given  in  all  projects  involving  data  collection. 
For  example,  validation  checks  should  be  strongly 
encouraged  where  feasible,  as  well  as  comparison 
of  the  results  with  data  from  other  studies,  and  so 
on.  Such  checks  would  provide  useful  data  for 
evaluating  the  validity  of  the  findings  as  well  as 
increase  our  knowledge  about  validity  and  the 
factors  that  affect  it.  Nowhere  is  research  more 
urgently  needed  than  in  the  area  of  confidential 
data  and  sensitive  issues,  because  the  problems 
of  refusal  to  answer  queries  and  untruthful  re- 
porting seriously  affect  the  validity  of  the  data. 
There  is  a  special  need  to  encourage  more  re- 
search on  novel  techniques  and  methods  of  im- 
proving validity  in  the  area  of  confidential  data 
and  sensitive  issues. 

8. The use of computers and the linkage of records
have caused problems regarding violation of confidentiality
and the invasion of privacy. This has
given  rise  to  the  publication  of  "Records,  Com- 
puters and  the  Rights  of  Citizens"  by  the  DHEW 
Secretary's  Advisory  Committee  on  Automated 
Personal  Data  Systems  in  July,  1973.  Many  of  the 
recommendations  in  this  report  are  sound  and 
fundamentally  just.  There  are,  however,  other 
recommendations  which  would  prohibit  even  the 
matching  of  death  certificates  with  birth  certifi- 
cates and,  when  implemented,  essentially  prohibit 
valuable  health  research  that  has  been  conducted 
for decades and which did not violate confidentiality.
It is essential that professional organizations
interested in health research and survey research
express their position regarding some of
the  provisions  of  this  Act  to  the  Committee  on 
National  Statistics  of  the  National  Academy  of 
Sciences.  The  American  Statistical  Association 
and  the  American  Sociological  Association  have 
recently  established,  with  three  other  professional 
societies,  an  Ad  Hoc  Committee  on  Government 
Statistics;  the  American  Statistical  Association 
also  has  an  Ad  Hoc  Committee  on  Privacy  and 
Confidentiality.  This  body  should  also  be  ap- 
prised of  the  possible  implications  of  the  Privacy 
Act  of  1974. 

9. The value, implications, and application of the
Total  Survey  Design  (TSD)  concept  should  be 
made  more  readily  available  to  the  survey  re- 
search community.  Conferences,  research  and 
training  programs,  and  publication  of  special 
monographs  can  be  very  instrumental  not  only  in 
this  instance  but  also  with  respect  to  other  issues 
discussed  here. 

10. Presupposing standardization of terms and definitions,
an information system containing pertinent
data on the various error components and
cost  components  associated  with  specific  measure- 
ment designs  used  in  sample  surveys  should  be 
established  for  use  by  the  survey  research  com- 
munity. Perhaps  the  establishment  of  a  national 
clearinghouse, as noted in Policy Recommendation
No. 4 above, would be the vehicle for developing,
maintaining, and disseminating such information.

11. The National Center for Health Statistics and
the  National  Center  for  Health  Services  Research 
should  sponsor  an  annual  Summer  Session  ad- 
dressing survey  research  in  health  including  the 
review  of  the  state  of  the  art  and  analytical  tech- 
niques and  processes  pertinent  to  recent  ap- 
proaches to  survey  research  in  health. 

The  discussion  ended  with  agreement  that  the 
suggestions  made  were  worthy  of  serious  attention. 
The  conference  has  produced  numerous  questions 
meriting  research  attention  and  the  strategies  for  dis- 
semination seem  to  hold  promise.  Operational  plans 
for  proceeding  in  both  of  these  areas  deserve  imme- 
diate attention  and  adequate  resources. 
Introduction

Researchers  collecting  health  data  by  survey 
methods  are  fortunate  on  the  one  hand,  because  most 
respondents  find  discussing  their  health  and  medical 
care  interesting  and  important.  On  the  other  hand, 
medical  studies  increasingly  require  complex  informa- 
tion, thereby  increasing  the  burden  on  the  respondent 
and  making  data  problems  more  likely.  In  this  ses- 
sion, the  major  emphasis  was  on  how  questionnaires 
and  other  survey  instruments  might  be  designed  to 
reduce  respondent  burden  and  increase  the  quality 
of  data. 

In  the  initial  planning  for  the  session,  four  topics 
were  included:  1)  respondent  burden;  2)  length  of 
time  to  complete;  3)  standardized  modules  or  stand- 
ardized measures;  and  4)  effects  of  instrument  com- 
plexity on  different  groups.  In  the  actual  discussion 
that  took  place,  the  first  two  and  last  two  topics 
seemed  to  group  themselves  naturally.  Thus,  they  are 
combined  in  this  summary.  Length  of  the  interview 
is  really  one  example  of  a  burden  on  respondents. 
Also  the  use  of  standardized  questions  and  modules 
assumes  that  these  modules  are  understood  and  re- 
acted to  similarly  by  all  different  groups  of  interest. 

This  summary  of  the  proceedings  tries  to  capture 
the  flavor  of  the  remarks,  but  the  interpretations  are 
the  responsibility  of  the  chairman  and  recorder.  Be- 
cause we  are  also  deeply  interested  in  these  issues,  our 
experiences  and  biases  may  creep  in.  Wherever  pos- 
sible, however,  we  have  tried  to  identify  the  people 
with their ideas so that interested readers may contact
the conference participants directly for additional
information  or  clarification. 

The  general  framework  is  to  present  a  problem 
in  collecting  health  data  and  give  some  indication  of 
the  nature  of  that  problem.  Then,  the  current  proce- 
dures used  to  handle  these  problems  are  described. 
Reference  is  made  to  research  on  these  problems, 
some  of  which  is  either  still  in  process  or  has  not  yet 
been  widely  adopted.  Finally,  we  end  with  a  list  of 
unsolved  problems  and  a  future  research  "want  list." 


Respondent  Burden 

Respondent burden concerns the level of demand
placed on the respondent in answering the survey
instrument questions. As the burden increases, a
number  of  problems  become  more  serious  for  the  re- 
search community:  1)  the  population  may  become 
more  alienated  toward  survey  research,  and  level  of 
cooperation  for  future  studies  may  decline;  2)  the  re- 
liability of  the  data  collected  may  be  lessened;  and 
3)  response  rate  in  subsequent  interviews  in  panel 
studies  may  decline.  The  following  discussion  of  re- 
spondent burden  refers  mainly  to  general  population 
surveys,  although  many  of  the  problems  and  attempts 
to  solve  them  are  also  germane  to  surveys  of  physi- 
cians, other  health  services  providers,  and  administra- 
tors in  the  health  delivery  field. 

The  major  types  of  burden  discussed  in  the  ses- 
sion on  health  survey  instruments  related  to  issues  of 
1)  recall  period;  2)  salience  of  information  requested; 
3)  frequency  of  events  the  respondent  is  asked  to  re- 
port on;  4)  use  of  proxy  respondents;  5)  complexity 
of  the  instrument;  and  6)  length  of  the  instrument. 
Each  of  these  topics  will  be  treated  below  and  an 
attempt  will  be  made  to  summarize  the  problems  as- 
sociated with  each  and  the  procedures  that  were  sug- 
gested to  ameliorate  the  problem. 

Recall  Period 

The  major  problem  with  asking  respondents 
about  events  that  happened  six  months  to  a  year  or 
more  prior  to  the  interview  is  that  respondents  are 
likely  to  forget  details  of  the  events  or  even  the  oc- 
currence of  the  event  itself.  There  are  trade-offs  in 
reducing  the  length  of  the  recall  period,  however 
(Horvitz). The costs of data collection increase.
Telescoping, or the reporting of events that actually occurred
outside of the accepted recall period, may also
increase.

The  National  Center  for  Health  Statistics 
(NCHS), in weighing the memory loss with a long
recall period against the days of experience lost with
a short recall period, arrived at a two-week reference
period for physician and dental visits. Studies conducted
by the Survey Research Center at the University
of Michigan have shown that reporting of hospitalization
using a twelve-month recall period is less
complete for hospitalizations that occurred early in
the recall period (Cannell and Fowler: 1965). Consequently,
NCHS is currently calculating hospitalization
rates using only hospitalizations reported in the
preceding six months. However, they continue to
collect  data  on  hospitalizations  for  an  entire  year.  A 
shorter recall period, such as six months, appears to
result in some telescoping (Fuchsberg). The Current
Medicare  Survey  uses  periods  of  one  month  to  study 
hospital  re-admissions.  Some  reporting  omissions  are 
found  for  short,  overnight  stays  for  diagnostic  tests 
(Scharff).
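
The practice of computing rates from only the most recent six months of reports can be sketched as follows. This is a minimal illustration, not NCHS's actual procedure: the function name `annualized_rate`, the 182-day window, the single shared interview date, and the sample dates are all assumptions made for the example.

```python
from datetime import date, timedelta

def annualized_rate(event_dates, interview_date, persons, window_days=182):
    """Hospitalizations per person per year, using only recent reports.

    Assumption: every person shares one interview date, and events older
    than `window_days` are ignored because early events are underreported.
    """
    window_start = interview_date - timedelta(days=window_days)
    recent = [d for d in event_dates if window_start < d <= interview_date]
    # Scale the count observed in the window up to a full year.
    return len(recent) / persons * (365 / window_days)

# Hypothetical reports from 10 persons interviewed on 1 July 1975.
reports = [date(1975, 6, 1), date(1975, 3, 15),
           date(1974, 10, 1), date(1974, 8, 20)]
rate = annualized_rate(reports, date(1975, 7, 1), persons=10)
```

Only the two 1975 events fall inside the window, so the two older reports are still collected but do not enter the rate.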

The  issue  was  raised  as  to  whether  there  is  fall-off 
in  reporting  even  with  a  recall  period  as  short  as  two 
weeks  for  physician  visits.  The  answer  appears  to  be 
yes  (Fuchsberg).  Nonetheless,  NCHS  judged  the  costs 
would  be  too  great  if  the  recall  period  were  reduced 
to,  say,  a  week  for  physician  visits. 

Another  problem  with  a  long  recall  period  is 
how  to  account  for  the  experiences  of  people  who 
died during the recall period (Horvitz). NCHS has
sponsored methodological studies to determine the
bias introduced by deaths in the population during
the recall period (Sirken). Using a six-month recall
period,  Horvitz  showed  that  some  of  the  losses  in  re- 
porting were  due  to  death  with  the  rest  due  to  mem- 
ory problems.  The  problem  is  more  serious  if  the 
event  being  studied  is  correlated  with  deaths,  as  is 
the  case  with  hospitalizations,  than  if  the  event  is  not 
correlated, as is the case with dental visits (Sirken).
In the former case, the loss is a monotonically increasing
function of the recall period.

The  bounded  interview  is  one  means  sometimes 
used  to  decrease  the  effects  of  telescoping  that  result 
from a shorter recall period (Neter and Waksberg:
1964) (Jabine). This process involves a baseline interview
and follow-up interviews which solicit the
reporting  of  events  that  happened  subsequent  to  the 
first interview. Sudman conducted a recent study involving
an initial interview with a three-month recall
period and three subsequent monthly interviews concerning
physician visits and disability days. The study
compared the bounded-interview approach and
diary  method  for  the  three  monthly  interviews 
(Sudman,  Wilson,  and  Ferber:  1974).  The  results 
show  telescoping  can  be  eliminated  by  reminding 
people  of  what  they  said  in  earlier  periods.  Omissions 
in  recall,  however,  still  occur. 
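
The bounding idea, in which earlier reports are used to recognize events repeated in a later interview, can be sketched as below. This is an illustrative simplification: the function `bound_followup`, the record fields, and the rule that provider plus reason identifies an event are assumptions for the example, not procedures described by the participants.

```python
def bound_followup(followup_events, baseline_events):
    """Separate new follow-up reports from likely telescoped repeats."""
    # Assumption: provider plus reason is enough to recognize the same event.
    seen = {(e["provider"], e["reason"]) for e in baseline_events}
    new, repeats = [], []
    for event in followup_events:
        key = (event["provider"], event["reason"])
        (repeats if key in seen else new).append(event)
    return new, repeats

# Hypothetical reports: the baseline interview bounds the follow-up.
baseline = [{"provider": "Dr. A", "reason": "sprained ankle", "month": "Feb"}]
followup = [
    # Same visit, now misdated into the new reference period (telescoped).
    {"provider": "Dr. A", "reason": "sprained ankle", "month": "Mar"},
    {"provider": "Dr. B", "reason": "dental checkup", "month": "Mar"},
]
new_events, telescoped = bound_followup(followup, baseline)
```

Note that this check can only remove repeated reports; it does nothing for events the respondent omits altogether.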

The  diary  was  suggested  as  a  possible  solution  to 
the omission problem (Sudman). While major events
such  as  hospitalizations  are  less  likely  to  be  forgotten, 
the  diary  approach  is  most  helpful  in  aiding  recall  of 
physician  and  dental  visits  and  disability  days.  The 
work of Mooney was cited as an early but still useful
comparison of bounded interview and diary approaches
(Mooney: 1962) (Woolsey).

Variations  in  the  use  of  the  diary  were  discussed. 
In  some  cases,  the  diary  is  a  relatively  complete  form 
which  is  filled  out  by  the  respondent  and  is  used  as 
the  primary  document  for  subsequent  data  processing 
(Survey Research Laboratory, University of Illinois).
In  other  instances,  the  diary  or  calendar  form  is  less 
formalized.  It  is  used  primarily  as  a  memory  aid  for 
the  respondent,  and  an  interview  schedule  is  subse- 
quently filled  out  by  an  interviewer  and  used  as  the 
processing document (Johns Hopkins Medical Economics
Study and the Rand Health Insurance Study).
A  major  advantage  of  the  more  complete  diary  tech- 
nique is  that  it  reduces  the  time  and  cost  necessary  to 
collect  the  data.  It  might  also  stimulate  more  com- 
plete recording  of  events.  The  use  of  the  diary  as  a 
memory  aid,  seemingly,  would  also  allow  the  collec- 
tion of  more  complex  forms  of  information. 

Assumptions  about  ability  to  retrieve  information 
lost  through  memory  decay  influence  the  approaches 
used  to  combat  such  loss  (Marquis).  If  the  assump- 
tion is  made  that  information  lost  through  memory 
decay  cannot  be  recalled,  the  general  approach  is  sim- 
ply not  to  ask  the  respondent  for  that  information, 
but  to  attempt  to  collect  it  through  some  other 
source.  An  alternative  view  is  that  little  that  has  been 
experienced  cannot  be  recalled  with  the  proper  mem- 
ory aids.  In  that  latter  case,  much  more  effort  is 
likely  to  be  devoted  to  recall  techniques. 

A  calendar  year  is  sometimes  used  for  the  recall 
period.  One  assumed  advantage  of  the  calendar-year 
recall  period  has  been  that  it  corresponds  to  certain 
types  of  records,  such  as  income  tax,  and  to  tradi- 
tionally defined  intervals,  such  as  yearly  salary.  Also, 
advantages  accrue  if  events  are  remembered  as  occur- 
ring before  or  after  the  start  of  the  New  Year.  The 
Current  Population  Survey,  however,  has  compared 
weekly  and  monthly  reporting  of  income  with  yearly 
income  reporting,  but  found  little  difference  (U.S. 
Bureau  of  the  Census,  1963)  (Gerson) . 

Given  the  differences  in  reporting  according  to 
length of recall period, the question arose as to why the mean
estimates  on  utilization  of  health  services  produced  by 
the  Center  for  Health  Administration  Studies  and  the 
National  Opinion  Research  Center  (CHAS-NORC) 
using  a  one-year  recall  period  have  been  similar  to 
those  produced  by  the  National  Health  Survey  using 
much  shorter  recall  periods.  Procedures  in  the  CHAS- 
NORC  studies  that  might  make  estimates  similar  to 
those  from  NCHS  include:  1)  incorporating  into  the 
estimates  verifying  information  provided  by  hospital 
physicians  and  insurance  companies;  2)  encouraging 
respondents  to  consult  records  of  income  tax,  doctor 
bills, and insurance policies; and 3) using aided recall
methods,  such  as  having  people  report  utilization 
separately for each episode of illness experienced during
the year and the number of times a particular doctor
was seen for each episode (Andersen, et al., 1976).


Salience  of  Event 

In  general,  it  has  been  assumed  that  the  more 
salient the event, the less the burden placed on the respondent
in reporting it. Respondents who pay fees
directly  for  the  physician  visits  may  see  these  events 
as  more  salient  than  those  who  get  care  at  no  direct 
cost.  There  is  some  evidence  that  fee-for-service  visits 
are more likely to be reported (Fuchsberg). Similarly,
in  the  Rand  Health  Insurance  Study,  completeness  of 
reporting  of  doctor  visits  was  compared  to  the  level  of 
reimbursement  people  received  for  their  medical  costs 
(Marquis). Those who were to receive reimbursement
for  the  visits  they  reported  would  probably  view  these 
medical  events  as  more  salient  than  those  who  were 
not.  Indeed,  when  the  information  was  collected 
monthly,  the  findings  were  that  persons  reimbursed 
for  their  visits  were  more  likely  to  report  their  visits 
than  those  who  were  not.  However,  there  was  no 
difference  using  a  weekly  diary.  This  finding  suggests 
the  need  to  use  a  short  recall  period  for  events  con- 
sidered less  salient  by  the  respondent,  whereas  a  longer 
one  can  be  tolerated  for  the  more  salient  events. 

Generalizations  about  salience  still  must  be  made 
with  caution,  however.  For  example,  some  events  that 
are  salient  but  extremely  painful  may  be  repressed.  An 
example  given  was  the  underreporting  of  infant  death 
in  social  surveys  when  compared  to  vital  statistics 
records (Horvitz).

Frequency  of  Event 

The more frequent or common the event, the
more  difficult  it  may  be  to  recall,  particularly  if  the  re- 
spondent is  required  to  recall  specific  events  and  de- 
tails. For  example,  nutrition  surveys  asking  people 
what  they  have  eaten  show  considerable  memory  loss 
with recall periods as short as a week (Greenberg).
Diaries  may  prove  particularly  useful  for  this  type  of 
event. 

Proxy  Respondent 

A  proxy  respondent  is  often  very  useful  to  sum- 
marize information  and  provide  details  for  a  family 
member  not  available  for  an  interview.  A  proxy  re- 
spondent is  essential  in  certain  instances,  such  as  when 
soliciting  information  on  young  children,  on  seri- 
ously ill  or  senile  persons,  and  on  deceased  family 
members. 

For  many  types  of  questions,  however,  proxy  re- 
porting appears  to  be  less  valid,  because  events  are  not 
as  salient  for  the  proxy  or,  in  some  cases,  because  the 
proxy  simply  does  not  have  the  necessary  information. 
For example, personal expenditure data tend to be
less  well  reported  by  proxies  than  by  self  respondents 
(Sudman  and  Ferber:  1970). 

For  certain  other  types  of  information,  such  as 
reasons  for  hospital  admission,  a  reasonably  informed 
proxy  serves  about  as  well  as  the  self  respondent.  For 
diseases that are socially undesirable or that involve
considerable threat to the individual, such as alcoholism,
diabetes and cancer, proxy reporting may actually
be superior (Sirken).

It  is  important  to  define  the  kind  of  proxy  re- 
spondent. Having  a  spouse  respond  for  the  subject  is 
very  different  from  having  any  other  related  person 
in  the  family  serve  as  proxy.  Having  the  head  of  the 
family  report  on  family  expenditures  and  income  is 
likely  to  be  more  helpful  than  using  just  any  avail- 
able adult.  The  results  depend  on  the  proxy's  relation- 
ship to  the  subject,  the  kinds  of  questions  asked,  and 
the  length  of  the  recall  period.  All  of  these  factors 
must  be  considered  in  deciding  whether  or  not  a 
proxy  will  be  an  acceptable  respondent.  It  appears 
that,  except  when  using  "sensitive"  questions  or  col- 
lecting information  on  children,  senile  persons,  and 
others  who  would  have  difficulty  remembering  and 
communicating,  respondent  burden  is  probably  re- 
duced by  questioning  the  subject  himself  rather  than 
a  proxy. 

Complexity 

The  impression  of  many  experienced  researchers 
is  that  health  questionnaires,  on  average,  are  complex. 
Respondents  are  often  requested  to  provide  details  on 
a  variety  of  subjects  such  as  expenditures  for  medical 
care;  utilization  of  hospitals,  physicians,  dental  care 
and  drugs;  sequence  of  events  in  an  episode  of  illness; 
perceptions  of  symptoms  and  illness  conditions,  com- 
prehensive accounts  of  their  perceptions;  and  evalu- 
ations of  their  medical  care.  When  these  types  of  in- 
formation, particularly  in  combination,  are  solicited 
from  the  respondent,  the  burden  may  be  excessive. 

One  strategy  to  reduce  this  burden  and,  at  the 
same  time,  to  hold  costs  down  is  to  use  simple,  less 
complex  questionnaires  in  the  initial  study  phases  on 
a particular topic (Dalenius). The complex details
would be collected in later phases and, then, only on
the issues which seem warranted by preliminary analysis.
For example, if the frequency of events in the preliminary
phase is low, if the reliability and/or validity
appear unacceptable, or if adequate data are supplied
by the simple form to answer the research question,
then using more complex questions might not be
indicated.

It  was  suggested  that  this  "incremental  approach" 
is  somewhat  analogous  to  calibration  techniques  used 
by physicists (Dalenius). Ideally, a point might be
reached  for  those  measures  that  require  complex  in- 
struments for  acceptable  validity  where  the  complex 
form  would  need  to  be  used  only  on  a  subset  of  the 
data.  The  results  from  this  subset  might  then  be  used 
to  adjust  the  major  data  set  collected  by  the  simple 
method.  This  approach  would  have  the  double  bene- 
fits of  reducing  collection  costs  and  the  burden  on  the 
respondents. 
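
One way to picture this adjustment is a simple ratio calibration: the subset that received both forms yields a correction factor, which is then applied to the mean from the simple form. The sketch below is only one possible reading of the idea. The function `calibrated_mean`, the choice of a ratio adjustment, and the numbers are assumptions for illustration and were not specified in the discussion.

```python
def calibrated_mean(simple_all, simple_sub, complex_sub):
    """Adjust the simple-form mean using a subset measured both ways.

    Assumption: the complex form is treated as the more valid measure,
    and a single ratio describes how the simple form understates it.
    """
    if sum(simple_sub) == 0:
        raise ValueError("subset has no events on the simple form")
    ratio = sum(complex_sub) / sum(simple_sub)
    mean_simple = sum(simple_all) / len(simple_all)
    return ratio * mean_simple

# Hypothetical disability days reported by 8 respondents on the simple form.
simple_all = [2, 0, 1, 3, 4, 2, 0, 1]
# Three of them also completed the complex form.
simple_sub = [2, 1, 4]
complex_sub = [3, 1, 5]
adjusted = calibrated_mean(simple_all, simple_sub, complex_sub)
```

Here the subset reports 9 days on the complex form against 7 on the simple form, so the simple-form mean of 1.625 is scaled up by 9/7.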

One  cannot  assume  that  the  results  achieved  from 
calibrating one measure, such as disability, can be
applied to other measures such as doctor visits or dentist
visits (Sirken). The reliability and validity of the
different  measures  may  vary  considerably,  and  each 
important  measure  should  be  calibrated  separately. 
One  study  of  people  discharged  from  hospitals  shows 
little  correlation  in  reporting  error  for  different  items 
including diagnosis, length of stay, and date of admission
and discharge (Cannell and Fowler: 1965).

It  should  also  be  noted  that  the  validity  of  the 
measure  may  differ  according  to  the  order  of  the  ques- 
tions asked  and  the  other  kinds  of  information  that 
are  collected  in  a  given  instrument.  For  example,  some 
differential  reporting  of  disability  days  might  be  ex- 
pected according  to  whether  or  not  other  questions 
about  illness  experience  preceded  those  about  dis- 
ability days. 

Length  of  Instrument 

There  was  general  agreement  that  personal  inter- 
views that  lasted  an  hour  or  less  caused  no  serious 
problems  in  health  surveys  whereas  interviews  lasting 
over  two  hours  caused  major  problems  with  both  re- 
spondents and  interviewers  due  to  fatigue  (Fuchsberg, 
Hensler,  Kulley,  Losciuto,  White,  and  Woolsey).  There 
were  mixed  opinions  about  problems  with  interviews 
lasting  from  one  to  two  hours,  but  consensus  that 
problems  increased  when  the  interview  length  in- 
creased from  one  to  two  hours.  Sirken  noted  that  it 
was  not  sufficient  to  observe  the  average  length  of 
interview,  inasmuch  as  interviews  that  averaged  45 
minutes  might  require  three  hours  for  some  re- 
spondents. 
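
Sirken's point about averages can be shown with a small calculation. The durations below are invented for illustration, and the two-hour threshold is taken from the discussion above only as a convenient cut-off.

```python
# Hypothetical interview lengths in minutes for ten respondents.
durations = [20, 25, 30, 30, 30, 35, 35, 35, 30, 180]

mean_length = sum(durations) / len(durations)
longest = max(durations)
# Share of interviews running past two hours.
share_over_two_hours = sum(d > 120 for d in durations) / len(durations)
```

The mean is 45 minutes, yet one respondent in ten sat through a three-hour interview, which is why the distribution of interview lengths, and not only the average, should be examined.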

Length  of  the  questionnaire  was  perceived  as  an 
even  more  critical  problem  in  self-administered  forms 
(Barnes, Boisen, Bradburn, and Dillman). Here it is
the  number  of  pages  rather  than  the  length  of  time 
to  complete  that  influences  cooperation  rate.  Dillman 
reported  substantial  reduction  in  both  quality  and 
cooperation  with  mail  samples  of  general  populations 
when  the  questionnaire  was  longer  than  12  pages 
(Dillman,  et  al:  1974) . 

The  chief  method  for  dealing  with  length  is  by 
conscious  efforts  to  cut  out  questions  that  are  interest- 
ing, but  not  critical.  Another  suggested  method  is  the 
use  of  matrix  sampling  in  which  not  all  respondents 
are  asked  all  questions.  That  is,  using  a  rotating  order, 
each  respondent  is  asked  only  a  subset  of  all  the  ques- 
tions of  interest  (Horvitz  and  Waksberg) . 
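
As a minimal sketch of the rotating assignment described above, every respondent could receive a core set of questions plus a rotating block of optional modules. The function name, the module labels, and the block size are illustrative assumptions and were not part of the conference discussion.

```python
def assign_modules(respondent_ids, core, modules, per_respondent):
    """Give each respondent the core items plus a rotating block of modules.

    Hypothetical helper: `core` is asked of everyone, while the optional
    `modules` are handed out in rotating order, `per_respondent` at a time.
    """
    n = len(modules)
    if not 0 < per_respondent <= n:
        raise ValueError("per_respondent must be between 1 and len(modules)")
    plan = {}
    for i, rid in enumerate(respondent_ids):
        # Start each respondent where the previous respondent's block ended.
        start = (i * per_respondent) % n
        block = [modules[(start + k) % n] for k in range(per_respondent)]
        plan[rid] = list(core) + block
    return plan


# Example with made-up module labels: three optional modules, two per person.
plan = assign_modules(range(6), ["core"], ["A", "B", "C"], 2)
```

Under this rotation each optional module reaches the same share of respondents (here two-thirds), so estimates for a module rest on a known subsample rather than on the full sample.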

Meyers  stressed  the  importance  of  being  honest 
with  respondents  and  telling  them  in  advance  how 
long  the  interview  might  last.  Cannell  suggested  that 
it  was  the  subjective,  not  the  objective,  length  of  the 
interview  that  was  more  important.  If  the  respondent 
enjoyed  the  interview  and  the  topics  were  interesting, 
the  interview  would  seem  shorter.  Even  with  an  in- 
teresting interview,  however,  extremely  long  question- 
naires caused  major  difficulties  (Woolsey) . 


Effects  of  Instrument  Complexity  on  Different  Groups 

In  the  abstract,  the  use  of  standardized  modules 
or  measures  seems  to  be  highly  desirable.  Not  only  is 
it  wasteful  to  re-invent  the  wheel,  but  doing  so  makes 
it  difficult  or  impossible  to  compare  the  results  of 
different  studies  if  different  ad  hoc  forms  are  used. 
Thus,  the  questionnaires  used  in  the  National  Health 
Interview  and  the  model  Neighborhood  Health  Cen- 
ters and  the  CHAS-NORC  National  Health  Expendi- 
tures and  Utilization  questionnaire  provide  models  to 
be  followed. 

Most  of  the  discussion,  however,  focused  on  the 
limitations  of  standardized  questionnaires  and  scales 
for  different  populations.  There  was  general  agree- 
ment that  questionnaires  that  work  well  with  middle- 
class  respondents  may  have  serious  problems  when 
used  with  lower  education  respondents.  Other  re- 
spondent characteristics  such  as  age,  ethnicity,  social 
status,  and  acquiescence  were  also  discussed.  It  appears 
that  little  is  being  done,  however,  in  most  surveys  to 
handle  some  of  the  problems  raised  and  there  seems 
to  be  only  limited  research  in  progress  or  contem- 
plated, although  research  in  this  area  is  both  vitally 
needed  and  feasible. 

Starting  with  education,  it  is  obviously  impossible 
to  get  accurate  response  if  the  respondent  does  not 
understand  the  words  used  or  the  meaning  of  the 
question.  It  is  often  impossible  to  detect  this  on  ex- 
amination of  a  finished  questionnaire,  but  this  lan- 
guage barrier  is  more  readily  evident  if  the  interview 
is  observed  or  if  one  listens  to  a  tape  recording.  The 
problem  becomes  worse  as  one  moves  from  behavioral 
to  attitudinal  questions  and  from  the  more  specific 
to  the  global.  Several  examples  were  cited  of  mail  sur- 
veys in  which  the  quality  of  the  data  declined  with 
successive  mailings.  The  major  reason  was  that  less 
educated respondents were more likely to respond on
later  waves  and  also  to  have  more  difficulty  with  a 
self-administered  questionnaire  (Dillman  and  Sirken) . 

The  current  procedures  of  pretesting  question- 
naires prevent  some  of  the  more  serious  problems,  but 
often  standardized  questions  or  modules  are  assumed 
to  be  satisfactory  with  limited  or  no  testing.  In  some 
cases,  particularly  dealing  with  attitudinal  and  per- 
sonality scales,  these  forms  were  standardized  on  col- 
lege students  and  it  is  dangerous  to  assume  that  they 
are  valid  for  other  less  educated  populations. 

The  current  procedures  standardize  on  wording, 
but  the  content  may  be  perceived  differently  by  re- 
spondents with  different  levels  of  education.  The  con- 
verse procedure  of  attempting  to  standardize  by  con- 
tent is  extremely  difficult  and  no  reports  of  its  use 
were  presented  at  the  conference,  although  this  meth- 
od is  in  wider  use  in  cross-cultural  studies. 

Age  differences  are  also  found  in  standardized 
questions.  Since  both  memory  and  energy  decline  for 
persons  above  middle  age,  questions  that  can  be 
answered accurately by younger persons are much
more difficult for the aged. Although these problems
are  recognized  in  studies  that  focus  entirely  on  the 
aged,  they  are  generally  ignored  in  studies  of  the  total 
population.  Solutions  could  include  reducing  the 
length  of  the  recall  period,  so  as  to  lessen  the  burden 
on  the  respondent's  memory  and  energy,  or  obtaining 
data  on  this  population  from  the  records  of  medical 
providers. 

Obtaining  data  from  medical  providers  presents 
special  problems.  The  chief  methods  for  securing  a 
high  degree  of  cooperation  and  accurate  response  from 
providers  are  to  have  the  study  sponsored  by  an  organi- 
zation the  provider  respects  and  to  make  clear  to 
the  provider  group  the  positive  applications  for  the 
findings. 

Ethnicity  differences  also  affect  standardized  ques- 
tions. Different  ethnic  groups  use  different  forms  of 
para-medical help such as curanderos by Spanish and
feldshers by Slavic respondents. Certain types of
symptoms and folk medicine practices are also unique to
some  ethnic  groups.  Particularly,  if  native  language 
questionnaires  were  used  instead  of  considering  the 
non-English-speaking  respondent  as  a  refusal,  these 
special  topics  might  be  included  in  the  questionnaire. 

The  lengthiest  discussion  related  to  respondent 
social  and  psychological  characteristics  and,  partic- 
ularly, acquiescence.  As  an  example,  the  work  of  Carr 
was  cited  as  indicating  the  effect  of  acquiescence  on 
the  Srole  anomie  scale  (Carr:  1971).  Since  acquiescence 
is  highly  related  to  the  respondent's  social  class,  much 
of  the  data  that  show  increased  alienation  in  lower 
class  black  respondents  may  be  caused  or  contami- 
nated by  acquiescence. 

Woolsey  gave  examples  of  memory  errors  that 
were  generally  consistent  across  respondents  over  sev- 
eral events.  Most  respondents  who  were  asked  to  recall 
the  date  of  important  news  events  tended  to  telescope 
the  dates  forward.  A  subgroup,  however,  consistently 
did  the  reverse.  Exactly  what  the  characteristics  are  of 
respondents  who  err  in  different  directions  is  not 
understood. 

The  need  for  social  approval  is  another  variable 
that  varies  by  respondent  and  influences  response. 
One  unexpected  consequence  of  the  need  for  social 
approval  is  the  overreporting  of  some  items  by  mem- 
bers of  subgroups  that  most  respondents  would  under- 
report.  The  smoking  of  marijuana,  sexual  experiences 
and  participating  in  deviant  activities  may  be  over- 
reported  by  teenagers  whose  peers  admire  such  activ- 
ities (Boruch) . 

Overreporting  of  socially  desirable  activities  is  re- 
duced by  using  less  personal  methods  such  as  self- 
administered  forms  and  telephone,  instead  of  face-to- 
face,  interviews.  Acquiescence  can  be  reduced  by  mak- 
ing the  questions  more  specific  and  less  global.  No 
method,  however,  completely  eliminates  underreport- 
ing of  highly  threatening  events,  although  this  prob- 
lem can  be  alleviated  by  methods  suggested  in  later 
sessions. 


References 

1.  Andersen,  Ronald,  Joanna  Lion  and  Odin  W. 
Anderson 

1976  Two  Decades  of  Health  Service:  Social 
Survey  Trends  in  Use  and  Expendi- 
tures, Cambridge,  Massachusetts:  Bal- 
linger  Publishing  Co. 

2.  Cannell,  Charles  F.  and  Floyd  J.  Fowler 

1965  "Comparison  of  Hospitalization  Re- 
porting in  Three  Survey  Procedures," 
Vital  and  Health  Statistics,  National 
Center  for  Health  Statistics,  DHEW 
Publication  No.  1000,  Series  2,  No.  8, 
Washington, D.C.: U.S. Government
Printing  Office. 

3.  Carr,  Leslie  G. 

1971  "The  Srole  Items  and  Acquiescence," 
American  Sociological  Review,  Vol.  36, 
No. 2 (April), pp. 287-293.

4.  Dillman,  Don,  Edwin  Carpenter,  James  Christen- 
son  and  Ralph  Brooks 

1974  "Increasing  Mail  Questionnaire  Re- 
sponse, A  Four  State  Comparison," 
American  Sociological  Review,  Vol.  39, 
No.  5  (October) ,  pp.  744-756. 

5.  Elinson,  Jack,  E.  Padilla  and  M.  E.  Perkins 

1967  Public  Image  of  Mental  Health  Serv- 
ices, New  York  City:  Mental  Health 
Materials  Center,  Inc.,  for  N.Y.C.  Com- 
munity Mental  Health  Board. 

6.  Haberman,  Paul  W. 

1976  "Psychiatric  Symptoms  Among  Puerto 
Ricans  in  Puerto  Rico  and  New  York 
City,"  Ethnicity,  Vol.  3,  in  press. 

7.  Mooney,  H.  William 

1962  Methodology  in  Two  California  Health 
Surveys,  Public  Health  Monograph  No. 
70,  Washington,  D.C.:  U.S.  Govern- 
ment Printing  Office. 

8.  Neter,  John  and  Joseph  Waksberg 

1964  "A  Study  of  Response  Errors  in  Expen- 
ditures Data  from  Household  Inter- 
views," Journal  of  the  American  Statis- 
tical Association,  Vol.  59,  No.  305 
(March) ,  pp.  18-55. 

9.  Sudman,  Seymour  and  Robert  Ferber 

1970  Experiments  in  Obtaining  Consumer 
Expenditures  in  Durable  Goods  by  Re- 
call Procedures,  Urbana,  Illinois:  Sur- 
vey Research  Laboratory,  University  of 
Illinois. 

10.  Sudman,  Seymour,  Wallace  Wilson  and  Robert 
Ferber 

1974  The  Cost-Effectiveness  of  Using  the 
Diary  as  an  Instrument  for  Collecting 
Health  Data  in  Household  Surveys, 
Urbana,  Illinois:  Survey  Research  Lab- 
oratory, University  of  Illinois. 


11.  U.  S.  Bureau  of  the  Census 

1963  The  Current  Population  Survey— A  Re- 
port of  Methodology,  Technical  Paper 
No. 7, Washington, D.C.: U.S. Government
Printing Office.

Summary  and  Conclusions 

The  topics  discussed  in  this  section  are  how  health 
survey  research  instruments  relate  to  respondent 
burden  and  differential  responses  of  subgroups  in  the 
population. 

Respondent  burden  is  the  level  of  demand  placed 
on  the  respondent  in  order  to  answer  questions  in  the 
survey  instrument.  Burden  is  generally  considered  to 
be  reduced  by  a  short  recall  period,  questions  about 
salient  and  infrequently  occurring  events,  questioning 
the  subject  personally  rather  than  a  proxy  respondent, 
and  limiting  the  length  and  complexity  of  the  survey 
instrument.  Various  techniques  were  considered  that 
might  allow  the  researcher  to  collect  the  needed  in- 
formation while  not  unduly  taxing  the  respondent. 

There  was  general  agreement  that  questionnaires 
that  worked  well  for  middle-class  respondents  may 
cause  problems  for  certain  minority  groups.  Thus, 
while  standardized  questionnaire  modules  are  highly 
desirable  for  purposes  of  standardization  and  compari- 
son, caution  must  be  exercised  in  applying  these 
modules  to  subgroups.  Respondent  characteristics  that 
must  be  taken  into  account  include  education,  age, 
ethnicity, social status, and tendency toward
acquiescence. Ways to reduce bias caused by these factors were
discussed. 

Needed  Research 

The  following  needs  for  additional  research  were 
mentioned  during  the  session.  The  listing  is  random, 
not  by  priority. 

1.  Judgments  on  length  of  recall  period  are  already 
supported  by  substantial  research,  but  additional 
work  is  needed  on  the  effects  of  varying  length  of 
the  recall  period  on  proxy  respondents  vs.  self- 
reports  and  on  old  vs.  young  respondents. 

2.  Although  diaries  are  being  used  and  tested  for  the 
collecting  of  health  data,  additional  work  is 
needed  to  see  how  they  work  with  differing  edu- 
cational levels.  Also  necessary  are  special  studies 
of  techniques  to  improve  respondents'  ability  to 
deal  with  more  complex  diary  forms. 


3.  More  systematic  measures  of  the  effectiveness  of 
aided  recall  procedures  and  the  use  of  records  are 
needed,  again  especially  with  differing  education 
levels. 

4. Questionnaire length can be reduced by
subsampling among the items of interest. The impact of
this  method  on  the  accuracy  and  completeness  of 
the  remaining  data  needs  to  be  measured.  Also, 
procedures  should  be  further  developed  for  esti- 
mating total  events  on  the  basis  of  a  sample. 

5.  A  similar  area  of  research  is  the  measurement  of 
improvement  in  quality  due  to  subsampling  mem- 
bers in  large  households.  Again,  procedures  are 
required  for  making  estimates  of  total  household 
experiences  and  expenditures  on  the  basis  of  a 
subsample. 

6. Systematic measures of the effects of the time
required to complete health interviews could easily
be  obtained  from  existing  data  by  measuring  qual- 
ity vs.  the  actual  length  of  the  interview.  Experi- 
mentally, one  could  measure  the  effects  of  time, 
length,  and  fatigue  by  varying  the  order  of  cer- 
tain sections  of  a  standard  questionnaire. 

7.  There  is  a  need  for  measuring  the  effects  of  vari- 
ous kinds  and  levels  of  incentives  on  cooperation 
and  quality  of  data  for  especially  complex  medical 
studies  and  for  special  subpopulations  for  whom 
the  interview  is  more  difficult.  These  incentives 
can  be  fiscal  or  other  kinds  of  goods  and  services. 

8.  Determining  equity  in  health  services  involves 
comparison  among  subgroups  of  the  population. 
Special  studies  for  dealing  with  data  collection 
from  various  ethnic  and  minority  groups  need  to 
be  supported. 

9.  Given  the  complexity  of  some  health  surveys,  ap- 
proaches to  reduce  respondent  burden  should  re- 
ceive special  attention.  Examples  include  bounded 
interviews,  matrix  sampling,  and  respondent  com- 
pensation. 

10.  With  the  proliferation  of  health  surveys  and 
attempts  to  use  them  for  health  planning  and 
prediction  of  utilization,  reliability  and  validity 
studies  of  questionnaire  items  are  becoming  in- 
creasingly important.  Funding  should  be  provided 
for  both  secondary  analysis  of  existing  data  and 
new  study  designs  to  measure  reliability  and 
validity. 
Introduction

This  conference  is  a  formal  recognition  of  the 
growing  interest  in  the  quality  of  data  in  health  sur- 
veys and  of  the  need  to  focus  research  attention  on 
improving  survey  methodology.  Historically  empha- 
sis has  been  on  interviewer  bias,  suggesting  that  the 
researcher  was  blameless  in  the  quality  of  reporting 
and  the  interviewer  was  at  fault. 

Over  time  has  come  the  realization  that  the  data 
collection  interview  is  a  complex  interactive  process 
between  two  people.  This  interaction  can  be  positive 
and  lead  to  accurate  and  complete  reporting,  or  it 
can  have  profoundly  negative  effects  that  bias  or  in- 
hibit accurate  reporting.  Recently,  emphasis  has  been 
focused  on  the  components  of  the  interaction,  and  re- 
search has  been  undertaken  to  use  the  interactive  force 
to  improve  reporting. 

Attention  has  also  been  given  to  the  growing 
awareness  that  the  task  given  the  respondent  is  fre- 
quently too  difficult  for  him  to  perform.  Sometimes  he 
does  not  possess  the  information  sought  or  he  is  un- 
willing to  report  it,  or  he  is  insufficiently  motivated  to 
put  forth  the  necessary  effort  to  produce  accurate  re- 
ports. Consent  to  be  interviewed  is  not  a  commitment 
candidly  to  answer  all  conceivable  questions  or  to 
expend  an  unlimited  amount  of  time  or  effort  on  the 
interviews.  Respondents  are  volunteers  who  control 
and  limit  this  participation.  The  limits  of  data  col- 
lection are  those  pieces  of  information  that  almost  all 
people  can  report  with  modest  effort  or  can  be  uni- 
versally induced  to  exert  the  effort  required  to  report. 

Session  I  focused  most  on  these  tasks.  This  session 
is  primarily  about  the  way  the  task  is  presented:  the 
interviewer  and  his  or  her  procedures. 

Telephone  Interviewing 

One  of  the  most  important  recent  changes  in  the 
interaction  process  of  the  interview  is  the  increased  use 
of  the  telephone.  Although  in  the  past  the  use  of  tele- 
phone interviewing  for  Government  sponsored  health- 
related  research  has  been  minimal,  it  has  become 
widely  used  in  recent  years. 


A  distinction  needs  to  be  made  between  three 
different  applications  of  telephone  interviewing: 

1.  For  re-interviews  of  persons  initially  inter- 
viewed in  person. 

2.  For  a  one-time  survey,  but  using  personal  fol- 
low-up procedures  to  include  those  who  either 
have  no  telephones  or  who  cannot  be  reached 
by  telephone. 

3.  For  a  one-time  survey  conducted  solely  by  tele- 
phone interviews. 

The  Federal  Government  has  used  the  telephone 
primarily  for  panel  survey  re-interviews.  Waksberg 
cited  the  use  of  telephones  for  re-interviews  in  the  Cur- 
rent Population  Surveys  by  the  Bureau  of  the  Census. 
Jabine  noted  the  use  of  telephones  with  a  panel  sur- 
vey of  Medicare  recipients.  The  Health  Interview 
Survey  is  experimenting  with  a  panel  to  obtain  health 
expenditure  data  by  telephone. 

When  a  sample  initially  interviewed  in  person  is 
re-interviewed  by  telephone,  there  are  no  special  sam- 
pling issues.  Gerson  reported  that  those  without  tele- 
phones or  for  whom  the  interview  is  particularly  com- 
plicated are  visited  by  a  personal  interviewer  in  the 
CPS. 

The  cost  savings  in  using  the  telephone  for  inter- 
views are  substantial  in  some  studies;  other  studies 
have  shown  only  minimal  cost  savings.  Moreover,  the 
use  of  the  telephone  has  been  carefully  evaluated  for 
both  the  CPS  and  Medicare  panels.  Comparisons  of 
response  rates,  the  reliability  of  answers  across  waves, 
and the distribution of responses reveal no statistically
significant difference between the success of telephone
re-interviews and personal re-interviews. LoSciuto
validated reports of date of birth via telephone and
personal interview against birth certificates and found no
differences.  The  reliability  of  reporting  of  items  pre- 
viously reported  was  the  same  whether  the  re-interview 
was  in  person  or  by  telephone  (Institute  for  Survey 
Research:  1975). 

Telephone  interviews  supplemented  by  personal 
interviews  have  also  been  used  for  one-time  interviews. 
For  household-based  samples,  the  basic  problem  is  to 
obtain  a  significant  number  of  telephone  numbers. 
Hochstim had field listers attempt to make contact
with households, obtaining telephone numbers whenever
possible without callbacks (Hochstim: 1967). In
Rhode  Island,  City  Directories  (such  as  those  pub- 
lished by  R.  L.  Polk)  were  used  to  obtain  telephone 
numbers  (Thornberry  and  Scott:  1973) .  A  combi- 
nation of  these  procedures  was  also  used  with  a  Ver- 
mont sample  (Fowler:  1973).  In  all  three  cases,  a 
personal  interview  was  carried  out  with  those  ad- 
dressees for  whom  an  interview  could  not  be  com- 
pleted by  telephone. 

The  percentage  of  all  interviews  completed  by 
telephone  varied  from  60  to  80  percent.  In  all  cases, 
significant  cost  savings  resulted  from  use  of  the  tele- 
phone. The  overall  response  rates  obtained  using  the 
combined  procedures  have  been  at  least  equal  to  those 
obtained  by  personal  interviewers.  Tully  reported  that 
his  evaluation  of  some  20  surveys  conducted  in  experi- 
mental health  areas  that  commonly  relied  on  this 
combined  strategy  generally  had  response  rates  in  the 
85  to  90  percent  range. 

Comparisons  were  carried  out  (Hochstim:  1967; 
Thornberry  and  Scott:  1973)  between  subsamples  in- 
terviewed entirely  in  person  and  comparable  sub- 
samples  interviewed  with  the  combined  telephone- 
personal  strategy.  Neither  study  found  any  significant 
differences  for  standard  health  items.  There  were  some 
differences  in  the  Hochstim  study  for  other  types  of 
items  included  in  the  survey. 

It  should  be  noted  that  household-based  samples 
permit  sending  an  advance  letter  to  respondents.  All 
the  studies  cited  above  in  which  response  rates  were 
in  the  85  to  95  percent  range  used  advance  letters. 
Dillman  said  he  had  evidence  that  response  rates  via 
telephone  were  significantly  better  if  an  advance  letter 
could be sent (Dillman and Frey: 1974). Colombotos
and  Boisen  mentioned  similar  experiences. 

Two  questions  of  note  were  raised  about  the 
above  procedures.  First,  if  the  percentage  of  house- 
holds for  which  telephone  numbers  can  be  obtained 
is  sufficiently  low,  the  cost  savings  associated  with  tele- 
phone use  may  be  mitigated.  Second,  if  the  addressees 
for  which  personal  interviews  are  necessary  because  of 
the  absence  of  telephones  are  widely  scattered,  the 
costs of the personal interviews may be
disproportionately high, again reducing the cost benefits of
telephone  usage. 
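
Both concerns can be illustrated with a simple blended-cost calculation. This is only a sketch: the function names are hypothetical and the unit costs in the example are invented for illustration, since no cost figures were reported in the session.

```python
def blended_cost(phone_share, phone_cost, followup_cost):
    """Average cost per completed interview under the combined strategy.

    phone_share   -- fraction of interviews completed by telephone (0 to 1)
    phone_cost    -- assumed cost of one telephone interview
    followup_cost -- assumed cost of one personal follow-up interview
    """
    if not 0.0 <= phone_share <= 1.0:
        raise ValueError("phone_share must lie between 0 and 1")
    return phone_share * phone_cost + (1.0 - phone_share) * followup_cost


def relative_saving(phone_share, phone_cost, followup_cost, all_personal_cost):
    """Saving relative to a survey done entirely in person, as a fraction."""
    blended = blended_cost(phone_share, phone_cost, followup_cost)
    return 1.0 - blended / all_personal_cost


# Invented unit costs: 10 per telephone interview, 40 per personal interview.
many_phones = relative_saving(0.8, 10.0, 40.0, 40.0)   # most cases by phone
few_phones = relative_saving(0.3, 10.0, 40.0, 40.0)    # few numbers obtained
scattered = relative_saving(0.8, 10.0, 100.0, 40.0)    # costly follow-ups
```

With these made-up figures the saving shrinks both when few telephone numbers can be obtained and when the personal follow-up cases are scattered and therefore expensive, which are the two questions raised above.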

The  problem  of  the  coverage  of  a  sample  was  dis- 
cussed as  a  greater  problem  for  studies  that  rely  solely 
on the telephone, without personal follow-up procedures.
The alternatives for sampling in this case include
lists, such as telephone directories, and random
digit  dialing.  Of  course,  good  lists,  such  as  those  of 
organizational  members,  are  no  problem.  However, 
telephone  directories  are  a  very  weak  source  from 
which  to  draw  a  population  sample.  Mobility— the 
problem  of  people  listed  in  the  directory  moving  and 
new  households  moving  into  an  area  that  are  not 
listed— is  one  major  concern.  People  who  request  that 
their  telephone  numbers  be  unlisted  or  unpublished 
are  another  omission.  Although  less  than  10  percent 
of  the  households  in  the  country  do  not  have  tele- 
phones, there  are  major  differences  by  race  and  in- 
come, with  the  rate  of  non-telephone-ownership  being 
as  high  as  25  to  30  percent  for  blacks.  The  latter  rate 
appears  to  be  primarily  a  function  of  income  and 
probably  applies  to  poor  whites  as  well.  In  any  case, 
there  are  biases  nationally  and  in  some  areas  if  non- 
telephone  owners  are  omitted  from  a  sample.  Waks- 
berg  cited  the  availability  of  not  fully  analyzed  1970 
census  data  that  would  be  helpful  in  describing  tele- 
phone ownership. 

Sudman  pointed  out  that  there  are  areas  in  which 
there  is  high  stability,  and  almost  universal  telephone 
ownership,  where  these  problems  are  minimal.  The 
mobility  problem  is  reduced  in  some  places  where  tele- 
phone directories  are  issued  every  three  months  and 
are  available  from  the  telephone  company  or  one  of 
the  directory  publishers  (such  as  R.  L.  Polk) .  Schu- 
man  cited  a  study  in  Cincinnati  (Klecka  and  Tuch- 
farber:  1974)  that  showed  little  difference  in  esti- 
mates from  a  household-based  personal  interview  sur- 
vey and  a  comparable  survey  using  random  digit  dial- 
ing. Random  digit  dialing  eliminates  the  problems  of 
incomplete  telephone  number  lists,  but,  of  course,  does 
not  provide  a  way  to  include  those  without  telephones. 
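
One simple form of random digit dialing can be sketched as follows, assuming the researcher already knows the area-code and exchange prefixes in use in the study area. The function name and the example prefixes are hypothetical, and the sketch is not a description of the procedure used in the Cincinnati study.

```python
import random


def random_digit_numbers(prefixes, n, seed=None):
    """Draw n distinct telephone numbers by appending four random digits
    to known area-code/exchange prefixes (hypothetical helper)."""
    if n > len(set(prefixes)) * 10_000:
        raise ValueError("more numbers requested than the prefixes can supply")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < n:
        prefix = rng.choice(prefixes)
        suffix = rng.randrange(10_000)  # 0000 through 9999, equally likely
        numbers.add(f"{prefix}-{suffix:04d}")
    return sorted(numbers)


# Illustrative prefixes only; real prefixes would come from the study area.
sample = random_digit_numbers(["513-555", "513-556"], 20, seed=1)
```

Because the last four digits are generated rather than looked up, unlisted and newly assigned numbers have the same chance of selection as listed ones. Households without telephones still cannot be reached this way.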

It  was  clear  that  there  are  circumstances  in  which 
each  of  the  procedures  suggested  as  a  way  to  obtain 
telephone  samples  can  be  useful.  However,  it  was  also 
clearly  agreed  that  the  researcher  must  be  aware  of 
who  will  be  omitted  by  the  sampling  procedure  he 
uses.  When  in  doubt  about  omission  biases,  it  is  prob- 
ably advisable  to  use  a  procedure  that  is  supplemented 
by  personal  interviews  to  include  those  without  tele- 
phones and/or  who  are  omitted  from  lists. 

Next  discussed  were  the  strengths  and  limitations 
of  telephone  procedures  for  the  kinds  of  questions  that 
can  be  asked.  It  is  clear  that  questions  requiring  visual 
materials  have  to  be  modified  for  telephone  use.  In- 
come, where  a  card  is  typically  presented  to  respond- 
ents with  a  large  number  of  detailed  categories,  was 
cited  as  an  example.  Sudman  reported  success  in 
achieving comparable results on the telephone by
presenting a series of intervals. He felt that most of the
objectives  achieved  by  visual  materials  or  cards  could 
be  achieved  on  the  telephone  with  imaginative  ques- 
tionnaire design.  Others  (notably  Dillman)  felt  that 
there  were  objectives  that  are  difficult  to  achieve  on 
the  telephone— for  example,  rank  ordering  of  lists. 
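
The details of the interval procedure were not reported, so the following is only one possible reading: instead of showing a card, the interviewer asks a short series of yes/no threshold questions that narrows the answer to a single income category. The function name and the cut points are assumptions made for illustration.

```python
def locate_bracket(cutpoints, is_at_least):
    """Find the income interval with a few yes/no questions.

    cutpoints   -- ascending category boundaries (illustrative values)
    is_at_least -- callable standing in for the respondent's answer to
                   "Was your income this amount or more?"
    Returns (lower, upper); None marks an open-ended interval.
    """
    lo, hi = 0, len(cutpoints)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_at_least(cutpoints[mid]):
            lo = mid + 1   # income is at or above this boundary
        else:
            hi = mid       # income is below this boundary
    lower = cutpoints[lo - 1] if lo > 0 else None
    upper = cutpoints[lo] if lo < len(cutpoints) else None
    return lower, upper


# Simulated respondent with an income of 12,000 and made-up boundaries.
cuts = [5000, 10000, 15000, 25000]
bracket = locate_bracket(cuts, lambda amount: 12000 >= amount)
```

Each question halves the remaining categories, so a card with many detailed categories can be replaced by a handful of short spoken questions.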

Colombotos cited an example of a question that
initially  yielded  biased  information  from  doctors 
(Colombotos:  1969) .  When  asked  how  many  journals 
they  read,  doctors  gave  a  lower  number  on  the  tele- 
phone than  in  person.  But,  when  the  question  was 
changed  to  ask  them  to  list  the  journals  they  read  regu- 
larly, the  difference  between  telephone  and  personal 
interview  procedures  disappeared. 


Data  collection  that  benefits  from  interviewer  ob- 
servation clearly  suffers.  Haberman  felt  that  alcoholics 
were  more  accurately  identified  in  personal  interviews 
because  the  questionnaire  data  were  supplemented  by 
interviewer  observations.  When  respondents  are  asked 
to  check  such  things  as  medical  records,  checkbooks, 
or  labels  on  prescription  bottles  (White)  to  verify 
their  recall,  the  task  is  much  more  easily  accomplished 
via  personal  interview  than  by  telephone. 

The  question  was  raised  as  to  whether  or  not 
there  were  certain  subjects  that  should  be  avoided  on 
the  telephone.  The  consensus  seemed  to  be  that  there 
were  not.  Properly  done,  it  appeared  that  the  range 
of  topics  likely  to  be  covered  in  health-related  studies 
have  been  "successfully"  carried  out  on  the  telephone. 

Coombs  and  Freedman  (1964)  have  been  using  tele- 
phone procedures  for  follow-up  to  personal  inter- 
views, asking  questions  on  pregnancies,  family  plan- 
ning, and  related  topics.  They  report  that  the  tele- 
phone appears  to  be  a  method  as  satisfactory  as  the 
personal  interview  for  collecting  such  data.  They 
stress,  however,  that  these  interviews  followed  a  per- 
sonal interview.  Mooney,  Pollack  and  Corsa  (1964) 
report  similar  success  on  sensitive  topics  such  as 
menstruation. 

Sudman  validated  reporting  of  "threatening"  in- 
formation such  as  drunken  driving  and  going  into 
bankruptcy.  Although  there  was  considerable  error  in 
reporting  both  in  a  telephone  and  a  personal  inter- 
view, there  was  no  marked  difference  in  data  accuracy 
obtained  in  the  two  procedures. 

Cannell reported some data from a study in Kansas
City using standard mental health scales that suggested
the following pattern: for those items that were
extremely  threatening  or  only  mildly  threatening, 
there  were  no  differences  between  telephone  and  per- 
sonal interview  responses.  For  items  that  appeared  to 
be  moderately  threatening,  however,  respondents 
seemed  more  likely  to  describe  themselves  in  a  positive 
way  on  the  telephone  than  in  a  personal  interview 
(Henson,  Roth  and  Cannell:  1974) .  Bradburn  cited 
work  using  his  "happiness"  item  that  showed  no  differ- 
ence; but  Cannell  thought  those  items  were  probably 
at  the  less  threatening  end  of  the  continuum  of  the 
Kansas  City  questionnaire.  He  concluded  that  there 
was  need  for  some  caution  and  further  research  as  we 
apply  telephone  interviewing  to  other  substantive 
areas. 

Overall,  with  proper  introduction  by  advance 
letter  or  with  an  initial  personal  contact,  there  is  no 
obvious  restriction  on  the  subjects  about  which  inter- 
views can  be  conducted  on  the  telephone— either 
through  concern  about  accuracy  or  response  rates. 
However,  the  data  cited  by  Cannell  and  the  lack  of 
comparative  research  for  one-time  surveys  on  topics 
other  than  fairly  basic  health  measures  suggest  a  need 
for  some  caution  and  some  further  research  with  re- 
spect to  the  application  of  telephone  to  the  more  sen- 
sitive topics. 


In  a  similar  vein,  some  people  expressed  concern 
about  how  long  a  telephone  interview  could  be.  Some 
organizations  try  to  limit  telephone  interview  length 
to  20  minutes  or  half  an  hour.  However,  other  re- 
searchers, such  as  Sudman,  reported  no  difficulty  in 
having  interviews  that  last  an  hour  or  more.  There 
seemed to be no basis for saying that the restrictions
on length of telephone interviews were any greater
than those on personal interviews, although it is clear
that most users tend to keep telephone interviews
shorter than personal interviews. Moreover, there was
little experience with telephone interviews that lasted
over an hour except with special populations.

The  suggestion  that  telephone  interview  sched- 
ules, and  perhaps  the  training  of  interviewers,  should 
be  different  from  those  for  personal  interview  sched- 
ules was  discussed  in  some  detail.  There  was  a  dearth 
of  hard  evidence,  but  there  was  a  considerable  amount 
of  feeling  that  some  compensation  was  needed  for  the 
absence  of  visual  cues  in  the  interaction  between  in- 
terviewer and  respondent.  Dillman  reported  that  he 
has  interviewers  frequently  summarize  answers  and 
verify  them  with  respondents.  Bradburn  discussed  the 
theoretical and laboratory work of Yngve, which suggests that there
are  "back  channel"  (feedback)  noises  people  make 
that  keep  conversations  going  that  should  be  standard- 
ized in  telephone  interviews.  This  seems  to  be  an  area 
in  which  some  comparative  research  is  needed.  At  the 
moment,  there  do  not  seem  to  be  any  clear  guidelines 
for  different  procedures  to  account  for  the  special  type 
of  interaction  on  the  telephone,  although  the  need  for 
such  procedures  seemed  likely  to  the  participants. 

Finally,  several  advantages  of  using  the  telephone 
in the administration of surveys were cited. Following
respondents who relocate during panel studies and
covering respondents who are widely dispersed
geographically are greatly facilitated by telephone. Persons
difficult to find at home can sometimes be more readily
reached by telephone. Persons residing in places where
interviewers are reluctant to go, such as urban high-crime
neighborhoods or high-rise apartments with extensive
security systems, may be reached more successfully
in a telephone survey. Busy professionals or elite
respondents may be contacted more successfully and
interviews completed at their convenience via telephone,
as Colombotos has shown.

It  was  also  suggested  that  telephone  interviewing 
might  reduce  the  between-interviewer  differences. 
Visual  cues  that  might  produce  bias  are  not  factors. 
However, Colombotos finds as much between-interviewer
variance on the telephone as in person in his
surveys of physicians. Apparently, that potential has
not yet been realized. Nevertheless, the potential for
close supervision of telephone interviewers (monitoring
their actual interaction with respondents, making
certain not to violate Federal laws regarding tapping)
and the potential to select and use interviewers without
restrictions on age, appearance, car ownership and
mobility should permit higher standards for interviewer
performance to be achieved, and increase standardization
of techniques. There is also the potential
for rapid entry of data for computer processing; telephone
interviewers with a terminal can enter answers during
the interview.
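
The between-interviewer variance discussed above is often summarized as an intraclass correlation: the share of total response variance that is attributable to which interviewer conducted the interview. The following is a minimal sketch under stated assumptions, not a procedure reported at the conference. It uses a one-way analysis of variance estimator, and the function name, the input layout, and the example figures are all hypothetical.

```python
from statistics import mean


def interviewer_icc(responses_by_interviewer):
    """Estimate the intraclass correlation from a one-way ANOVA.

    responses_by_interviewer maps a (hypothetical) interviewer id to the
    list of numeric answers that interviewer obtained on one question.
    """
    groups = [g for g in responses_by_interviewer.values() if g]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two interviewers and some replication")

    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective workload per interviewer, adjusted for unequal assignments.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    denominator = ms_between + (n0 - 1) * ms_within
    if denominator == 0:
        return 0.0
    return (ms_between - ms_within) / denominator


# Illustrative data only: reported disability days by interviewer.
telephone = {"int_a": [2, 3, 2, 4], "int_b": [5, 6, 5, 7], "int_c": [3, 3, 4, 2]}
print(round(interviewer_icc(telephone), 3))
```

Computing the same quantity separately for telephone and personal interviews would allow the kind of comparison Colombotos describes, provided respondents were assigned to interviewers at random.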

In  summary,  there  appears  to  be  consensus  that 
for  many  purposes  for  which  personal  interviews  have 
been  used,  telephone  techniques  can  produce  data  of 
equal  or  higher  quality,  often  at  lower  cost.  The  pos- 
sibility of  excluding  significant  segments  of  a  study 
population  by  exclusive  reliance  on  the  telephone 
needs  careful  attention  in  any  given  study  design. 
There  seem,  however,  to  be  few  if  any  bases  for  saying 
a  priori  that  the  telephone  is  a  less  satisfactory  data 
collection  modality  than  the  personal  interview.  The 
telephone  interview  has  the  potential  for  solving  some 
problems  that  have  plagued  personal  interview  proce- 
dures. There  is  need  for  further  research  as  its  appli- 
cations are  extended,  but  there  are  few  obvious  limits 
on  its  utility  at  the  moment. 

Compensation 

The  question  of  paying  respondents  or  compen- 
sating them  in  some  other  way  for  their  participation 
in  a  survey  has  been  debated  for  years.  In  this  confer- 
ence, the  focus  was  only  on  the  rate  of  cooperation  or 
the  quality  of  response.  Issues  such  as  the  appropriate- 
ness of  paying  low-income  respondents  for  reasons  of 
justice  may  be  very  important  in  certain  contexts  but 
are  not  considered  here. 

One  could  argue  that  compensation  could  in- 
crease respondent  commitment  to  a  task  and  relieve 
feelings  of  exploitation;  or  that  it  could  detract  from 
reporting  accuracy  by  making  people  feel  that  they  are 
being  bribed.  Sudman  has  some  data  indicating  that 
diary  keeping  may  be  somewhat  more  complete  when 
respondents  are  compensated;  but,  in  general,  there 
is  no  conclusive  evidence  on  this  point. 

There  is,  in  contrast,  a  good  deal  of  data  on  the 
value  of  compensation  to  increase  cooperation  in  data 
collection  efforts.  It  appears  that  when  respondents 
are  being  asked  to  accept  a  moderate  task,  within  the 
range  of  the  standard  one-time  interview  of  about  an 
hour,  compensation  does  not  have  a  significant  effect 
on  response  rate.  However,  when  the  positive  forces 
on  respondents  to  cooperate  are  fairly  low— as  in  a 
mail  survey— or  when  a  great  deal  is  being  asked  of 
respondents,  compensation  appears  to  be  helpful. 
Panel studies using diary techniques benefit from compensation,
particularly in the third and fourth waves
(Sudman and Ferber). Success with payment to induce
a  sample  of  young  adults  to  take  a  series  of  tests  that 
took  several  hours  was  also  reported  (Chromy  and 
Horvitz:  1974). 

Compensation need not always be monetary. Feedback
on panel results has proven helpful to maintain
panel cooperation (Sudman), and Greenberg pointed
out that providing diagnostic information resulting from
health examination surveys is an incentive for cooperation.

Hagerman  reports  that  his  interviewers  liked 
being  able  to  compensate  respondents  in  a  study  of 
alcoholism,  although  it  was  not  clear  it  affected  the 
response  rate.  The  role  of  the  interviewer  was  also 
cited  by  LoSciuto:  when  interviewers  had  small  gifts 
from  which  respondents  could  choose,  slightly  higher 
response  rates  were  obtained. 

Knowing  at  what  level  to  compensate  respondents 
was  discussed  briefly.  Horvitz  cited  the  need  for  em- 
pirical testing  to  decide  the  amount  of  compensation 
to  offer  for  a  given  task.  This  issue  was  reinforced  by 
findings  in  psychological  laboratory  experiments,  in 
which  paying  too  much  actually  reduced  task  per- 
formance while  moderate  compensation  increased  the 
performance. 

In  general,  there  was  little  enthusiasm  for  com- 
pensating respondents  unless  unusual  demands  were 
made  of  them,  such  as  repeated  interviews,  lengthy 
interviews,  difficult  tasks,  etc.  Perhaps  the  sense  of  the 
conference  was  best  reflected  in  a  Bureau  of  the  Cen- 
sus experiment,  reported  by  Gerson.  When  trying  out 
a  diary  technique,  Census  paid  matched  groups  differ- 
ent amounts  of  money  and  compared  the  response 
rates  with  an  unpaid  control  group.  The  response 
rates  were  unsatisfactory  for  all  groups.  They  then 
proceeded  to  work  harder  on  training  their  inter- 
viewers, who  without  compensating  respondents  sub- 
sequently obtained  much  higher  response  rates  than 
any of the preceding groups. The moral may be that
there  are  many  ways  to  enlist  respondent  cooperation. 
Although  it  may  seem  only  fair  to  compensate  those 
of  whom  a  great  deal  is  to  be  asked,  for  more  modest 
tasks  there  are  other  better  understood,  more  reliable, 
and  probably  more  effective  ways  to  enlist  cooper- 
ation; and  most  researchers  would  probably  do  best 
using  those. 

Response  Sets 

The  group  spent  some  time  discussing  "response 
set." The ideas that emerged reflected a breadth of
concepts ranging from a description of
symptoms to more causal hypotheses. In contrast to
some  of  the  other  topics  discussed  at  the  session,  it  is 
clear  that  the  issues  are  complicated,  and  they  inter- 
act in  complex  ways.  The  discussion  was  illuminating 
even though many issues were raised and no firm conclusions
were reached. Clearly, this is an area on which
much more research attention needs to be focused.

Marquis'  statement  of  the  problem  is  that  it  is 
useful  to  think  of  a  response  set  such  as  acquiescence 
or  conformity  as  a  response  that  is  generated  by  some 
stimulus  other  than  the  question  content  itself. 

Such  extraneous  stimuli  include  a  wide  variety  of 
factors:  the  form  or  wording  of  the  question;  the 
difficulty of producing an adequate response; the perception
of, or the expectations of, the interviewer
(especially if the interviewer is seen as being of higher
status); the level of effort the respondent is willing to
exert;  and  the  social  desirability  of  alternatives  offered 
to  respondents. 

These  can  be  subsumed  under  two  general  head- 
ings. In  one  case,  it  appears  that  there  are  forces  that 
dominate a question, leading many respondents to
answer in a way that does not reflect their true
response. In the other case, the respondent either has
no  ready  response  or  the  task  of  generating  a  valid  re- 
sponse requires  greater  effort  than  he  is  willing  to  put 
forth.  In  the  latter  situation,  his  response  may  be 
based  on  some  extraneous  cue  from  the  question,  from 
the  interviewer,  or  from  some  other  aspect  of  the  situa- 
tion. 

The  discussion  differentiated  the  source  of  the  cue 
from  the  form  of  the  response  that  is  generated.  In 
some  cases  the  source  and  the  response  mode  are 
closely  linked,  and  in  others  they  are  quite  independ- 
ent. For  example,  if  the  interviewer  is  perceived  as 
having  more  education  than  the  respondent,  the  re- 
spondent may  be  reluctant  to  report  that  he  seldom 
reads  books.  If  an  abstract  question  is  not  understood, 
but  only  a  yes  or  no  response  is  required,  the  re- 
spondent may  merely  pick  one  answer  rather  than  ad- 
mit his  lack  of  understanding. 

The  following  is  a  list  of  the  various  factors  dis- 
cussed at  the  conference  that  have  been  found  to  in- 
fluence the  kinds  of  responses  that  are  obtained: 

Relative  status  of  interviewer  and  respondent 

Indicators  of  differences  in  status  may  influence 
the  respondent  and  the  interviewer.  The  status  differ- 
ence may  be  mediated  by  such  factors  as  ingratiation, 
resistance  or  conformity.  Indicators  of  status  may  in- 
clude education,  social  class,  or  income. 

Some  research  (Fowler:  1965;  Cannell,  Fowler 
and  Marquis:  1968)  has  found  clear  differences  in 
interviewer  behavior  toward  respondents  with  differ- 
ent levels  of  education,  particularly  in  their  inter- 
personal interaction  and  feelings.  Respondent  percep- 
tion of  the  interview  situation  and  behavior  in  the 
interview  also  differed  by  education.  For  example, 
when  the  respondent  was  of  lower  education  level  than 
the  interviewer,  respondent  behavior  was  more  ingra- 
tiating and  submissive.  In  experimental  work,  when 
the  interviewer  had  higher  education  than  the  re- 
spondent, feedback  resulted  in  better  respondent  per- 
formance. However,  when  the  respondent  was  of 
higher  education,  performance  did  not  improve,  and 
in some cases, worsened (Marquis, Cannell, Laurent:
1972). One hypothesis to explain these results was that
the relative status made feedback appropriate for and
welcomed by less educated respondents, but inappropriate
for and resented by more highly educated respondents.

There  were  various  comments  on  the  effects  of 
these status differences on the quality of data. Weiss
found middle-class interviewers obtained better reporting
from people on welfare than indigenous interviewers
did (Weiss: 1968). Bradburn cited Hyman's results
that showed that similarity of interviewer and respondent
led to unwarranted assumptions by the interviewer
that he understood the view expressed by
respondents (Hyman: 1954). An anecdote was told
about  past  ingratiating  behavior  of  low-income  Blacks 
in  the  South  with  respect  to  white  interviewers— the 
status  difference  produced  apparent  cooperation  but 
little  actual  cooperation  in  obtaining  good  data. 

Other  demographic  differences 

Interviewer  age  and  sex  have  been  thought  to  be 
important  to  the  quality  of  responses.  Age  or  sex 
matching  of  respondent  and  interviewer  has  been  ad- 
vocated, especially  for  topics  in  which  these  may  be 
expected  to  influence  the  response.  Bradburn  reported 
a  consistent  finding  that  young  interviewers  (most 
often college students) were especially poor interviewers.
He attributes this, however, not to age but
to inadequate training and experience. Other comments
also  suggest  that  the  major  variable  involved  is  the 
adequacy  of  the  training  the  interviewer  receives  rather 
than  the  age  characteristic.  LoSciuto,  Colombotos, 
Meyers  and  others  report  that  when  training  was  ade- 
quate, sex  or  age  of  interviewer  showed  no  effect, 
even  in  studies  in  which  differences  might  be  expected. 

One  of  the  most  extensive  discussions  dealt  with 
interviewers'  race.  The  results  of  that  discussion  are 
summarized  in  a  subsequent  section. 

The  form  of  the  question 

There  was  some  discussion  centering  around  the 
idea that some question forms, such as agree-disagree,
yes-no, or unbalanced formats, may be especially
likely to lead to particular response sets. These apply
particularly  to  attitude  scales  commonly  used  in  soci- 
ological research.  It  was  suggested  that  acquiescence 
scales  can  be  built  into  questionnaires  to  identify 
people  most  likely  to  respond  with  certain  sets.  It  was 
also  noted  that  these  procedures  are  difficult  to  apply; 
moreover,  the  issue  has  somewhat  limited  relevance  to 
standard  health  surveys. 
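
One way to read the suggestion that acquiescence scales can be built into a questionnaire is to include pairs of items worded in opposite directions and to count how often a respondent agrees with both members of a pair. The sketch below assumes that design; the item identifiers and the scoring rule are hypothetical and are not taken from any scale discussed at the conference. As Ware's comments in this section indicate, a high score may reflect the form of the items rather than a stable tendency of the respondent.

```python
def acquiescence_score(answers, reversed_pairs):
    """Return the fraction of opposed item pairs the respondent agreed with twice.

    answers maps an item id to True (agree) or False (disagree).
    reversed_pairs lists (item, opposite_item) tuples whose wording is
    logically opposed, so agreeing with both is treated as a sign of yea-saying.
    Returns None when no pair was fully answered.
    """
    usable = 0
    agreed_with_both = 0
    for item, opposite in reversed_pairs:
        if item in answers and opposite in answers:
            usable += 1
            if answers[item] and answers[opposite]:
                agreed_with_both += 1
    if usable == 0:
        return None
    return agreed_with_both / usable


# Illustrative respondent: agrees with both wordings of the first pair only.
answers = {"q1": True, "q1_reversed": True, "q2": True, "q2_reversed": False}
pairs = [("q1", "q1_reversed"), ("q2", "q2_reversed")]
print(acquiescence_score(answers, pairs))
```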

Carr  reported  on  his  study  of  acquiescence  re- 
sponse with  the  Srole  Anomie  Scale  (Carr:  1971). 
There was considerable discussion of acquiescence and
other response sets, their characteristics and causes.
Ware suggested that what appears to be acquiescence
is  often  a  reflection  of  the  form  in  which  the  question 
is  asked,  especially  those  in  which  the  alternatives 
given  the  respondent  are  in  fact  not  alternatives.  He 
cited  other  forms  of  scale  items  that  may  give  rise  to 
acquiescent-appearing  responses  but  which  in  fact  re- 
flect other  factors.  It  was  generally  agreed  that  more 
needs  to  be  learned  about  response  sets— what  they 
are  and  what  factors  underlie  them. 


The  subject  of  the  question 

There  are  well-established  effects  of  the  affective 
component  of  questions.  Such  concepts  as  social  de- 
sirability and  threat  to  self-image  were  mentioned. 

Some  research  (Cannell,  Fisher  and  Bakker:  1965; 
and  Cannell  and  Fowler:  1965)  found  that  the  more 
threatening the reason for hospitalization, the less
likely such hospitalization would be reported. A similar
pattern was found in the likelihood that a chronic
condition would be reported (Madow: 1967). Sudman
reported  validated  data  that  showed  under-reporting 
of  undesirable  events,  such  as  being  arrested  for  drunk- 
en driving,  and  over-reporting  of  desirable  character- 
istics, such  as  having  a  library  card. 

Monteiro  said  that  differences  in  male-female  re- 
ports of  disability  days  because  of  illness  may  reflect 
a  greater  reluctance  of  men  to  admit  they  were  ill. 
Greenberg  said  that  Sidney  Cobb's  data  indicate  that 
men  who  are  unemployed  are  more  likely  to  attribute 
disability  days  to  illness  rather  than  to  being  laid  off. 
Both  were  referring  to  unpublished  data.  Apparently, 
it  is  more  socially  desirable  to  report  illness  than  un- 
employment. In  these  and  in  previous  examples,  the 
result  of  social  desirability  forces  is  likely  to  be  biased 
data. 

Cues  from  the  interviewer 

The  interviewer  may  give  cues  that  affect  re- 
spondent behavior.  The  types  of  cues  that  may  be 
given  are  virtually  limitless— involving  verbal  and  non- 
verbal cues. 

Marquis  suggested  that  the  number  and  types  of 
probes an interviewer uses, and his pace, can communicate
to respondents certain expectations for respondent
behavior. It has been found (Cannell, Fowler
and Marquis: 1968) that interviewers' goals of accuracy
and speed, as established in independent interviews
with the interviewers, are communicated to respondents.
Fuchsberg noted that the speed of the interview is one of
the most important indicators to the participants that
there is a rush to complete the interview.

Reaction  to  the  difficulty  of  the  task  or  question 

When  the  respondent  is  given  a  difficult  question 
or  one  that  requires  a  great  deal  of  effort  to  retrieve  the 
information  requested,  he  is  liable  to  take  a  short  cut 
that  produces  response  error.  The  researcher  is  often 
unaware  that  this  has  occurred,  because  an  acceptable, 
codable  answer  was  obtained.  Some  data  on  task  diffi- 
culty (Cannell,  Fisher  and  Bakker:  1965  and  Cannell 
and  Fowler:  1965)  show  that  hospital  stays  in  the  dis- 
tant past  and  those  that  had  low  impact  (because  they 
were  short  or  did  not  involve  surgery)  were  less  likely 
than  others  to  be  reported. 


Personality  or  cultural  patterns 

It  was  suggested  that  some  cultural  patterns  gen- 
erate response  tendencies.  Reeder,  for  example,  sug- 
gested that  Mexican-American  respondents  in  Los 
Angeles  may  have  a  yea-saying  tendency.  Others  have 
suggested  that  certain  ethnic  groups  tend  to  exaggerate 
or  minimize  their  health  problems. 

Racial  matching  of  interviewers  and  respondents 

The  issue  of  whether  more  valid  data  are  ob- 
tained in  interviews  in  which  the  interviewer  and  re- 
spondent are  of  the  same  race  has  been  the  subject  of 
much  discussion  over  the  past  several  years.  This  sub- 
ject is  of  sufficient  importance  in  survey  research  that 
we  have  separated  the  discussion  from  other  related 
topics. 

There  was  general  agreement  among  participants 
that  whether  the  race  of  the  interviewer  and  respon- 
dent was  the  same  or  different  had  no  discernible 
effect  on  the  data  reported  except  where  the  interview 
focused  on  racial  topics.  This  conclusion  was  stressed 
especially  by  Bradburn,  LoSciuto,  and  Schuman  who 
reported  their  own  and  other  research  to  substantiate 
this position (Schuman and Converse: 1971). The evidence
also generally supported the conclusion that
racial matching made no difference in response rates
(LoSciuto). However, Schuman found that black interviewers
did obtain somewhat lower response rates
in  some  white  neighborhoods.  It  was  suggested  that 
studies  showing  black/white  differences  might  well 
reflect  less  thorough  training  of  the  black  interviewers. 

From  this  discussion,  the  participants  generally 
agreed  that,  while  there  may  be  some  minor  effects 
from  racial  differences  between  interviewer  and  re- 
spondent, much  of  the  early  concern  over  the  issue  has 
been  dissipated.  Black  interviewers  are  generally  as 
successful, both in response rates and in the data obtained,
with white respondents as with black. Similarly,
white interviewers can interview respondents of either
race.

When  the  topic  of  the  interview  is  racially  related, 
however,  significant  effects  are  reported  (Schuman 
and  Converse:  1971).  Interestingly  enough,  it  appears 
that  matching  the  race  of  the  participants  is  not 
always  best.  Greenberg  reports  the  feeling  of  inter- 
viewers in  a  study  covering  sensitive  topics  of  family 
planning. The interviewers felt that black interviewers
obtained less accurate reports from black respondents
because the respondents were concerned that their
responses might be spread to other members of the black
community by the black interviewers. Weiss provides
quantitative findings consistent with these impressions
(Weiss: 1968).

The conference hoped that this discussion might
help lay to rest the recurring topic of race matching
of interviewers and interviewees. That race may reflect
other characteristics that will affect interviewer
results (education, sex, age, socioeconomic status, and
most important, training and experience) should not
be overlooked.

Conclusion 

There  are  other  points  that  could  have  been 
raised  that  can  or  do  affect  the  responses  obtained. 
Marquis  pointed  out  that  we  should  not  only  con- 
sider interviewer  status  or  respect  as  a  source  of  bias, 
but  also  the  possible  positive  uses  of  interviewer  in- 
fluences. This  was  an  appropriate  follow-up  to  his 
earlier  characterization  of  response  bias.  In  fact,  most 
of  the  factors  discussed  in  the  section  influence  re- 
sponses. In  some  cases  we  know  how  to  minimize  their 
effects  by  manipulating  question  wording  or  objec- 
tives. However,  in  many  cases  the  possibility  that  an 
extraneous  factor  may  influence  a  response  cannot  be 
eliminated.  Rather  than  setting  out  on  the  almost  end- 
less task  to  eliminate  all  possible  biasing  factors,  the 
solution  would  seem  to  be  to  strengthen  and  structure 
those  forces  that  lead  to  the  desired  outcome,  namely, 
accurately  answering  the  questions  asked.  The  goal 
should  probably  be  to  make  giving  accurate  answers 
the dominant force in the interview. Some efforts to
accomplish  that  goal  are  discussed  in  the  final  section 
of  this  session. 

Methods  for  obtaining  respondent  cooperation 

The  topic  of  pay  to  respondents  as  a  means  to 
motivate  good  interview  performance  was  discussed 
earlier.  One  of  the  criticisms  of  payment  was  that  it 
was  unclear  what  meaning  the  payment  would  have  to 
respondents.  No  doubt  the  meaning  would  depend  on 
the  respondent's  circumstances  and  the  way  the  com- 
pensation is  presented.  A  payment  for  time  spent  may 
have  a  positive  effect  whereas  a  perception  of  it  as  a 
"gift"  may  appear  as  a  reward  for  giving  the  inter- 
viewer the  answers  he  wants. 

The researcher's goal, of course, is to induce the
respondent to accept the task of being a good respondent:
not simply to go through the motions of giving
some answers, but to attempt, insofar as possible, to
give the answers that meet the researcher's objectives.
Throughout  the  conference,  and  in  this  session  in  par- 
ticular, the  issue  of  how  to  achieve  this  was  discussed. 
The  desire  to  contribute  to  research,  the  relationship 
to  the  interviewer,  and  payment  are  examples  of  forces 
that  operate  on  respondents  in  ways  that  may  help  to 
achieve this goal. However, their meaning is unstructured
and the ways respondents react to them can vary
widely.

Overall,  though,  we  know  that  survey  data  are 
the  result  of  a  complex  interaction  among  inter- 
viewers, respondents,  question  content,  and  interview 
procedures.  We  have  to  take  respondents  as  they  occur 
in  the  population,  and  question  objectives  may  be 
difficult  to  change,  although  they  can  often  be  modi- 
fied to  reflect  our  understanding  of  what  respondents 
are  in  fact  willing  and  able  to  do.  The  main  things 
we can change, however, are the interviewers' behaviors
and the procedures they are asked to work
with.

Cannell  presented  some  findings  of  a  research 
program  designed  to  harness  and  structure  the  po- 
tential forces  that  can  be  brought  to  bear  on  respond- 
ents in  a  way  that  is  much  more  directly  related  to 
achieving  the  research  goals.  The  studies  attempt  to 
accomplish  this  by  focusing  on  the  information  being 
reported  and  on  the  reporting  process  rather  than  on 
any  personal  affective  response  to  the  respondent. 

Rapport  may  be  (as  Hyman  noted  years  ago)  not 
only  unnecessary,  but  deleterious  to  the  interviewer 
role.  A  more  professional,  task-oriented  interaction 
may be preferable and produce better reporting
(also see Dohrenwend et al.: 1968).

Woolsey  and  Gerson  commented  that  Census  in- 
terviewers are  characterized  by  a  very  businesslike 
approach.  Gerson  believes  this  also  has  the  advantage 
of  reducing  interview  time,  and  thus,  saving  money. 

Early  studies  examining  the  interaction  patterns 
showed that much interviewer behavior was spontaneous;
that is, it was not part of the interviewer's training or
instructions but reflected particular interpersonal needs
of  the  moment  or  the  interviewer's  interaction  with 
others.  These  idiosyncratic  behaviors  were  individual 
in  nature,  varied  from  interviewer  to  interviewer,  and 
were  outside  the  control  of  the  researcher.  The  feed- 
back techniques  used  were  especially  individual  in 
nature  and  appeared  to  be  a  major  potential  source 
of  interviewer  variation  in  the  completeness  and  ac- 
curacy of  obtained  responses. 

Experiments were designed and conducted to control
much of the interaction, especially the important
feedback techniques. This was done by making the
questions more self-contained and by specifying feedback
statements for the interviewer to use. These statements
focused explicitly on the process and content
of  the  respondent  activities  of  answering  questions 
rather  than  on  the  rapport  or  interpersonal  affective 
aspects  of  the  interview. 

Essentially  three  different  kinds  of  strategies  have 
been  used.  One  of  these  involves  giving  respondents 
detailed  instructions  about  what  they  are  supposed 
to do, not just on individual questions, but for the
interview as a whole. Instructions stressed the importance
of accuracy and of reporting even minor events, and
encouraged the respondent to work hard to recall
distant or insignificant events. Such instructions appear
to improve reporting significantly.

A second strategy is designed to clarify the respondent's
commitment to accuracy by making it
explicit. When a respondent agrees to an interview,
typically  he  or  she  probably  does  not  know  details  of 
what  has  been  agreed  to.  Some  respondents  are  prob- 
ably agreeing  to  go  through  the  motions,  while  others 
are  agreeing  to  provide  accurate  information.  Inter- 
viewers may  communicate  differently  about  what  they 
perceive the agreement to be. No doubt, some interviewers
communicate the fact that all they want is
some answers to fill some blanks in the questionnaire.

In  an  experimental  study  of  the  effects  of  com- 
mitment, a  written  form  was  presented  to  respondents 
for  their  signature.  The  form  stated  that  by  signing 
the  document,  respondents  were  agreeing  to  give  com- 
plete and  accurate  information  to  the  best  of  their 
ability.  They  were  told  that  if  they  could  not  make 
this  commitment,  they  should  not  continue  with  the 
interview.  The  interviewer  also  signed  the  same  form, 
committing  himself  in  writing  to  the  confidentiality 
of  the  data. 

This  procedure  accomplishes  at  least  three  goals. 
First, it has the potential to eliminate the respondent who
is a chronic problem for survey researchers: the one
who appears to accept the task but in fact does not.
Second,  it  clarifies  and  standardizes  for  respondents 
what  they  are  agreeing  to  do.  Third,  by  virtue  of  sign- 
ing, the  parties  are  in  essence  making  a  commitment, 
which  becomes  an  additional  force  on  them  to  honor 
it. 

The  results  were  that  only  a  small  percentage  of 
respondents  refused  to  sign  the  form,  and  the  quality 
of  reporting  improved  significantly. 

Sudman  reported  a  similar  procedure  used  in  a 
pretest  of  collection  of  income  data.  In  this  case,  he 
used a lengthy introduction that noted that the subject
of income is a sensitive one and that some people have
fears about the IRS or possible misuse of the information.
It assured respondents that answers would not be misused
and advised respondents that they did not have to
answer questions. However, if they chose to answer
a question, they were requested to do so accurately.
Essentially, it told the respondent that the researchers
would rather have no answer at all than a poor one,
and that by answering the question, a commitment
was being made to do it to the best of the respondent's
ability.  Sudman  reports  that  the  rate  of  refusal  to 
answer  is  lower  with  this  instruction  than  without  it, 
but  that  the  quality  of  information  appears  to  be  con- 
siderably better  than  with  the  standard  approach.  He 
is  presently  conducting  a  field  experiment  that  will 
provide  data  on  this  way  of  asking  about  income. 

A third strategy studied by Cannell is the use
of positive and negative feedback by the interviewer,
depending on the respondent's answer.

In  effect,  the  procedure  rewards  "good"  perform- 
ance and  reacts  negatively  to  inadequate  performance. 
For  example,  a  respondent  was  asked  whether  she  had 
been  sick  or  not  feeling  well  at  any  time  during  the 
past  two  weeks.  If  illness  was  reported,  this  positive 
feedback  was  given— "That's  the  kind  of  information 
we  need  for  this  study."  If  the  respondent  gave  a  rapid 
"no"  response,  the  feedback  used  was  something  like: 
"You  answered  quickly.  Sometimes  it's  hard  to  remem- 
ber these  things.  If  you  think  about  it  again  you  may 
remember something." This strategy is another attempt
to clarify what is expected of the respondent.
Moreover, it makes it clear that the interviewer expects
a certain level of quality and is not a passive
person who will accept any level of performance. We
know that interviewers communicate their expectations
(Fowler: 1965); but they communicate different
ones, in an unstandardized way. One important goal
of the strategy is to standardize the expectations interviewers
communicate and the way they communicate
them. The result of the procedure is a significant
increase in the reporting of events or behaviors known
to be commonly underreported.
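
The feedback rule described above can be written down as a small decision procedure, which shows how such a protocol standardizes what the interviewer says. This is only a sketch under stated assumptions: the two statements are the wordings quoted in the text, but the function name, the use of a measured answer time, and the two-second threshold for a "rapid" answer are hypothetical and do not come from the studies reported.

```python
from typing import Optional

# Feedback wordings as quoted in the text above.
POSITIVE_FEEDBACK = "That's the kind of information we need for this study."
RECONSIDER_FEEDBACK = (
    "You answered quickly. Sometimes it's hard to remember these things. "
    "If you think about it again you may remember something."
)


def choose_feedback(
    reported_event: bool,
    seconds_to_answer: float,
    quick_threshold: float = 2.0,  # hypothetical cutoff for a "rapid" answer
) -> Optional[str]:
    """Pick the scripted feedback for one answer, or None if no feedback applies."""
    if reported_event:
        # Reward the reporting of an illness or other event.
        return POSITIVE_FEEDBACK
    if seconds_to_answer < quick_threshold:
        # A rapid "no" prompts the respondent to reconsider.
        return RECONSIDER_FEEDBACK
    # A considered "no" receives no scripted feedback in this sketch.
    return None


print(choose_feedback(reported_event=False, seconds_to_answer=0.8))
```

Because every interviewer applies the same rule with the same wording, the expectations communicated to respondents no longer depend on each interviewer's own habits.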

An  analogous  experiment  reported  by  Sudman 
also  used  directive  probes.  For  those  who  denied  ever 
using  marijuana,  for  example,  the  probe  was,  "Not 
even  once?" 

These  studies  have  not  been  fully  analyzed  and  the 
full  potential  of  these  strategies  has  not  yet  been  eval- 
uated. However,  they  mark  an  important  and  promis- 
ing avenue  to  improve  the  quality  of  data  collection. 
Interviewing  has  its  roots  in  the  nondirective  clinical 
interview.  Being  nondirective  in  terms  of  the  content 
of  the  answers  is,  of  course,  essential;  but  being  non- 
directive  with  respect  to  the  quality  of  the  answers  is 
not.  Interviewer  variation  has  consistently  been  found 
to  be  very  large.  Examination  of  the  interviewer- 
respondent  interaction  shows  that  the  majority  of  the 
discussion  between  interviewer  and  respondent  is  not 
standardized  and  it  deals  with  something  other  than 
the  question  and  answer  process. 

One  important  aspect  of  these  experimental  efforts 
is  to  minimize  interviewer  behavior  that  is  not  stand- 
ardized. By  structuring  transitions,  instructions,  and 
responses  to  answers,  some  of  the  sources  of  between- 
interviewer  variation  can  be  reduced.  Moreover,  the 
structuring is in the direction of clarifying respondent
expectations  and  setting  some  standards  for  them. 

In  the  conference  discussion,  it  was  pointed  out 
that  these  procedures  may  lead  to  overreporting. 
Clearly,  reinforcement  can  most  easily  be  used  when 
the  direction  of  bias  or  error  is  known  in  advance.  It 
is  more  difficult  to  apply  when  there  is  not  a  clear 
criterion  for  interviewers  to  use.  However,  the  general 
instructions  and  strategies  for  enlisting  commitment 
to  accuracy  can  be  applied  to  all  kinds  of  reporting. 

The  tests  of  these  procedures  thus  far  have  been 
limited  to  increasing  the  reporting  of  events  or  condi- 
tions commonly  underreported.  The  criterion  has 
been  that  more  reporting  is  better  reporting.  For  the 
items  used,  this  assumption  is  well  based  in  validity 
studies.  However,  these  strategies  require  more  devel- 
opment and  testing.  In  particular,  validation  is  needed 
with  better  criteria  for  accuracy  and  with  a  wider 
range  of  health  items. 

The  training  and  procedures  of  interviewers  have 
remained essentially unchanged for thirty years. Looking
at the reporting problems we have at the moment
argues  that  we  sorely  need  to  do  a  better  job  of  elicit- 
ing high-quality  performance  from  respondents.  Work 
to date suggests that we can do better through structuring
interviewer behavior and setting clear goals for
respondents. At this time, this is perhaps the most
promising  area  of  research  to  improve  survey  data 
collection  methods. 

Summary and Conclusions

This  session  of  the  conference  focused  on  the 
interaction  between  the  interviewer  and  respondent, 
and  several  positive  and  negative  results  from  these 
interactions  were  discussed.  That  the  task  given  a 
respondent  is  often  too  difficult  for  him  to  perform 
adequately  and  willingly  was  given  attention  during 
this  session  as  well  as  in  the  one  that  preceded  it. 

Telephone  interviewing  received  special  attention 
since this method of data collection is being utilized
increasingly because of its lower costs compared with
personal interviews. The danger that some segments
of  the  population  will  be  excluded  from  the  sample 
if  the  telephone  is  used  as  the  exclusive  method  must 
be  given  careful  consideration. 

There  is  considerable  evidence  that  the  quality  of 
the  data  from  telephone  interviews  is  comparable  to 
personal  interviews,  although  more  evidence  on  this 
issue  needs  to  be  obtained. 

Compensating  respondents  evoked  considerable 
discussion,  the  general  conclusion  being  that  there 
was  no  good  evidence  that  financial  or  other  rewards 
improved  the  response  rates,  and  there  was  some  con- 
cern that  such  techniques  might,  in  fact,  introduce 
biasing  forces.  It  was  generally  agreed  that  unless 
special  demands  were  made  on  the  respondent,  in 
terms of time or work load (such as keeping extensive
diaries), compensation should not be made.

Some  of  the  complex  issues  of  response  sets  were 
discussed. Since this subject is both complex and diverse in
nature, no firm conclusions were reached. Rather, the
nature of the discussion and the lack of adequate research
data suggest that the area deserves much more
research  attention.  Diverse  topics  were  considered, 
ranging  from  the  effects  of  relative  status  between 
interviewer  and  respondent,  the  form  of  the  question, 
the  nature  of  the  subject  of  the  question,  and  per- 
sonality and  cultural  response  patterns.  On  racial 
matching  of  interviewers  and  respondents,  the  group 
felt  that  the  evidence  was  sufficiently  clear  to  con- 
clude that  when  the  topic  of  the  survey  or  individual 
questions  were  racially  related,  racial  matching  was 
important.  For  non-racial  issues,  racial  matching  was 
unimportant  and  not  a  potential  source  of  bias. 

The  session  concluded  with  descriptions  of  cur- 
rent research  designed  to  improve  the  quality  of  re- 
search data  by  achieving  better  respondent  perform- 
ance. Several  studies  were  described  in  which  inter- 
viewer techniques,  question  designs,  and  other  pro- 
cedures were  used  to  improve  reporting.  Several  such 
techniques  had  significant  effects  on  respondent  per- 
formance. The  general  conclusion  was  that  research 
on  respondent  performance  and  techniques  for  im- 
proving it  was  particularly  promising  and  should  be 
encouraged. 

Needed  Research 

There are six areas of research that emerge from,
or  are  directly  related  to,  the  discussion  in  this  section. 
1.  The  increased  use  of  the  telephone  for  Govern- 
ment-sponsored research  is  clearly  a  new  and  im- 
portant development.  There  are  three  clear  areas 
of  uncertainty  regarding  the  use  of  telephone  in- 
terviews: 

a)  How  to  use  telephones  as  the  basic  data  collec- 
tion modality  for  a  one-time  survey  without 
serious  concerns  about  the  quality  of  the  sam- 
ple. While  random  digit  dialing  and  household 
based  samples  using  combinations  of  telephone 
and personal interviews are promising and appear
to be successful, there is a lack of clear
guidelines for when and how telephones can
be used. The issues include when (for which
populations, under what conditions) relying
on the telephone alone is biasing, what kinds
of  supplemental  procedures  can  be  used  to 
avoid  these  biases,  and  the  circumstances  under 
which  combinations  of  telephone  and  other  pro- 
cedures are  not  cost  effective. 

b)  There  is  a  need  for  research  on  what  special 
interviewer  techniques  or  behaviors  are  re- 
quired for  telephone  interviewing. 

c)  There  needs  to  be  more  extensive  testing  of 
the  effect  of  telephone  procedures  on  the  qual- 
ity of  data.  There  have  been  no  conclusive  ex- 
periments in  which  the  quality  of  telephone 
data  has  been  independently  validated.  Al- 
though aggregate  data  on  the  telephone  ap- 
proximate personal  interview  data  for  standard 
health  interview  items,  the  application  of  tele- 
phone procedures  to  a  broad  range  of  subject 
matters, particularly those where social desirability
bias is an issue, has not been fully
evaluated. 

2. The role of interviewer behavior in interview results
is a critical area for research. We know there
is  a  great  variation  by  interviewer  without  fully 
understanding  the  reasons  or  what  to  do  about  it. 
There  is  some  promising  preliminary  work  that 
leads  one  to  suspect  that  interviewer  variation  can 
be  decreased  and  interviewers  can  be  used  to 
greatly  improve  respondent  performance.  However, 
this  work  needs  to  be  much  more  developed.  We 
are  not  close  to  being  able  to  prescribe  specific 
interviewer  procedures. 

3. Although there is uncertainty about what the interviewer's
conduct should be, there are many procedures
that are well documented as being essential
to produce reliable data. However, there is
great variability in the training and monitoring
procedures used to implement these procedures.
The  importance  and  content  of  training,  the  ef- 
ficacy of  different  amounts  or  kinds  of  training, 
and  efficacy  of  different  quality  control  strategies 
have  not  been  evaluated  adequately.  Such  evalua- 
tion is  critical  as  we  attempt  to  set  guidelines  for 
better  methodology. 

4.  Related  to  Number  3  above  is  the  problem  of 
adequately  reporting  on  the  quality  of  data  col- 
lection. Researchers  report  sampling  error  esti- 
mates as  if  they  were  the  only  or  primary  source 
of  error  in  survey  data.  They  report  response  rates. 
However,  the  way  interviewers  do  their  job  has 
been  shown  to  account  for  50  percent  of  variance 
in  the  data's  accuracy.  It  is  highly  desirable  to  de- 
velop standard  indices  for  the  quality  of  data  col- 
lection. This  means  determining  what  the  relation- 
ship is  between  various  indices  of  interviewer  be- 
havior and  the  quality  of  data  that  results. 

5.  Of  all  the  demographic  issues  raised,  perhaps  the 
most  pervasive  is  the  impact  of  status  differences 
between  interviewer  and  respondent.  This  stems, 
in  large  part,  from  the  fact  that  most  established 
interviewing  staffs  are  well  educated— at  least  high 
school graduates, and more commonly college educated.
The consequences of the fact that in most
survey projects a substantial number of interviews
are conducted by interviewers who are obviously
of higher status than their respondents are not well
documented. Furthermore, because that is likely
to remain the case, we need research on how best
to deal with this situation; we need to know in
what circumstances status difference between interviewer
and respondent is important, and what
interviewers can do to counteract the negative consequences.

6. Additional research on "Respondent Compensation"
is needed, especially on varying levels and
types of compensation. Current evidence on this
subject is mixed, i.e., compensation payments
neither improve the response rate or the quality
of the data, nor do they have a deleterious effect.

Introduction

At  the  outset,  the  Chairman  explained  the  origin 
of  this  particular  session  and  summarized  the  reasons 
for  dividing  the  time  as  shown  in  the  outline  of  the 
Conference  Agenda.  The  idea  started  out  as  a  session 
that  would  be  devoted  to  the  problems  and  methods 
of  collecting  data  on  highly  sensitive  and/or  confiden- 
tial questions.  The  purpose  was  to  promote  a  discus- 
sion of  randomized  response  and  other  survey  methods 
that  are  useful  in  minimizing  mean  square  error, 
where  the  latter  term  is  defined  as  the  sum  of  the 
square  of  the  bias  plus  the  sampling  variance. 
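
The definition of mean square error given above can be illustrated with a minimal sketch. The function name and the numeric values are hypothetical and are used only to show how a biased estimate with small sampling variance can still have a larger mean square error than an unbiased estimate with larger variance.

```python
def mean_square_error(bias: float, sampling_variance: float) -> float:
    """Mean square error as defined in the text: bias squared plus sampling variance."""
    return bias ** 2 + sampling_variance


# Hypothetical figures, for illustration only.
# An unbiased estimate from a small sample (large sampling variance).
unbiased_small_sample = mean_square_error(bias=0.0, sampling_variance=4.0)

# A biased estimate from a large sample (small sampling variance).
biased_large_sample = mean_square_error(bias=3.0, sampling_variance=1.0)

print(unbiased_small_sample, biased_large_sample)
```

Under these assumed figures the biased estimate has the larger mean square error even though its sampling variance is smaller, which is why reducing reporting bias matters as much as increasing sample size.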

The  bias  refers  to  the  error  in  estimating  the 
true  mean  of  a  distribution  of  a  continuous  variable 
or  the  proportion  of  some  attribute  measured  on  a 
dichotomous  or  multichotomous  scale.  The  reason  for 
the  bias  involved  here  is  the  lack  of  cooperation  that 
results  in  nonresponse  or  untruthful  reporting  by 
those  who  appear  to  participate  fully  by  providing  a 
reasonable,  but  not  completely  correct,  response.  It 
is  worth  noting  that  the  bias  not  only  affects  the 
measurement  of  central  tendency,  but  that  frequently 
the  impact  is  even  greater  upon  the  magnitude  of  the 
mean  square  error.  There  is  no  simple  or  direct  way 
to  measure  bias;  otherwise,  one  could  adjust  esti- 
mates of  mean  and  standard  deviation  by  correcting 
for  such  bias.  The  goal  is  to  eliminate  or  reduce  to  a 
minimum  the  effect  of  this  bias  in  reporting.  Thus, 
the  topic  of  sensitive  questions,  listed  as  the  third 
item  in  the  Agenda,  was  basically  the  starting  point. 

As  the  conference  planners  considered  the  prob- 
lem of  bias,  it  became  evident  that  more  than  simply 
sensitive  questions  were  involved  since  bias  may  re- 
sult from  other  sources  as  well.  Thus,  the  scope  of  the 
session  was  broadened  to  consider  validity  in  general. 
In this connection, the planners considered the usefulness of
checking the accuracy of responses by means of simultaneous record
checks, in which data from several sources are collated
and compared. Data should not be accepted at face
value  and  supplemental  sources  of  information  should 
be  used  whenever  records  can  be  checked  in  this  man- 
ner. Further  consideration  of  record  checks  led  to  the 
realization  that  the  process  would  sometimes  involve 
the problem of matching and linkage of records. Currently,
these are most often carried out by computers
using  material  from  data  banks  and  other  large  data 
registers.  The  problems  of  matching  and  techniques 
of  linkage  comprise  a  whole  subject  area  requiring 
full  attention  in  itself,  but  interest  was  primarily  in 
the  uses  of  linkage  as  a  means  to  verify  data.  This  re- 
sulted, therefore,  in  the  second  item  listed  on  the 
agenda  for  this  session. 

In  any  discussion  of  record  linkage,  the  question 
arises  as  to  whether  such  matching  may  be  a  violation 
of  the  confidentiality  of  the  data.  The  use  of  a  record 
for  a  purpose  not  intended  when  the  information  was 
originally collected raises invasion of privacy questions.
Thus, before the planners realized it, the problem of obtaining
answers to sensitive and potentially stigmatizing
questions  had  broadened  into  the  four  areas  in  the 
agenda,  viz.,  confidentiality  and  the  invasion  of  pri- 
vacy problem  (IPP),  record  linkage  to  establish  val- 
idity, methods  of  obtaining  answers  to  sensitive  ques- 
tions, and  the  use  of  simultaneous  record  checks  to 
ascertain  validity.  Discussion  is  planned  in  that  se- 
quence. 

Confidentiality  and  Invasion  of  Privacy 

Before  inviting  participation  from  others,  the 
Chairman  introduced  the  subject  of  confidentiality 
and  invasion  of  privacy.  It  seems  ironic  that  at  a  time 
when  society  needs  more  data  on  the  personal  and 
family  life  of  its  citizens  in  order  to  plan,  administer, 
and  evaluate  social  programs  of  all  kinds  in  health, 
education,  and  welfare,  there  is  a  growing  tendency 
to  view  the  collection  of  such  data  as  an  evil  in  itself. 
Undoubtedly,  there  have  been  instances  of  abuse  and 
misuse  of  personal  data  files  by  persons  doing  so 
either  intentionally  or  without  realizing  it.  Thus, 
data  on  race  in  the  field  of  health  have  come  under 
criticism  (Terris:  1973)  and  then  later  were  defended 
(Greenberg  and  Cassel:  1974).  At  any  rate,  the  po- 
tential for  abuse  and  misuse  of  personal  and  con- 
fidential data  always  exists;  therefore,  there  has  been 
a  growing  tendency  to  prevent  the  collection  of  such 
data or to restrict its use to a narrow or limited
audience. In some contexts, any demographic variable
involving age, race, sex, place of origin, education,
marital status, and others can be viewed by some as
confidential information and potentially embarrassing.
Thus,  although  statisticians  and  survey  researchers 
have  been  faced  with  problems  of  confidentiality  ever 
since  the  first  census  in  1790,  the  individuals  and 
agencies  involved  were  aware  of  the  highly  sensitive 
nature  of  their  data  files  and  took  unusual  precautions 
to  protect  individuals  and  groups.  The  data  were 
treated  in  a  scientific  manner  and  caution  was  exer- 
cised not  to  allow  commercialization  of  the  results 
or  embarrassment  and  risk  to  any  individual  when 
data  were  disaggregated  into  minute  components. 

The  advent  of  computers,  credit  cards,  and  the 
growth  of  large  data  banks  on  all  types  of  persons 
and  their  characteristics  began  to  chip  away  at  the 
sanctity  of  the  data  and  the  methods  previously  con- 
sidered adequate  as  safeguards.  Within  the  last  few 
years  there  has  emerged  a  curious  alliance  of  two 
large  groups  anxious  to  restrict  the  compilation  and 
use of such data files. On the one hand, there are
liberals  who  are  concerned  with  the  protection  of 
civil  liberties  and  the  fear  of  big  brother  in  govern- 
ment repressing  the  rights  of  individuals.  Data  files 
of  the  FBI  and  CIA  may  justify  this  concern.  On  the 
other  side  of  the  spectrum  are  those  who  would  gen- 
erally be  considered  politically  conservative  and  who 
have  been  opposed  to  any  data  collection  procedure 
that  might  impinge  upon  their  freedom  to  act  for 
their  own  interests.  Thus,  questions  about  family 
planning  were  early  viewed  as  an  invasion  of  the 
privacy  of  the  bedroom.  The  interesting  case  that 
occurred  prior  to  the  1970  census  when  a  Congress- 
man from  Ohio  tried  to  limit  the  type  of  data  col- 
lected by  the  Bureau  of  the  Census  is  another  illus- 
tration of  this  concern.  The  item  that  triggered  this 
reaction  was  the  proposed  question  about  a  family's 
bathroom  and  whether  it  was  private  or  shared  by 
others. 

A  recent  piece  of  legislation  enacted  in  1974 
bears  directly  on  the  question  of  confidentiality  and 
invasion  of  privacy  in  data  collection  efforts.  It  is 
called  the  Privacy  Act  and  became  effective  on  Sep- 
tember 27,  1975. 

The  provisions  of  the  Act  were  reviewed  briefly 
(Jabine). It was pointed out that the Act applied to
record  systems  directly  controlled  by  Federal  agencies 
and  to  systems  operated  for  Federal  agencies  under 
grant  or  contract  with  private  agencies.  Retrievable 
records,  both  administrative  and  survey  types,  that 
identify  individuals  are  the  focus  of  this  legislation. 
The  Act  includes  the  following  provisions: 

1)  An  inventory  of  data  systems  will  be  published  so 
that  the  public  will  be  advised  of  the  existence  of 
all  data  systems  that  meet  the  provisions  of  the 
Act.  The  compilation  of  a  comprehensive  inven- 
tory of  eligible  data  systems  is  now  under  way. 

2)  The  prospective  respondents  must  be  informed 
about  the  authority  for  collecting  the  data,  whether 
mandatory or voluntary, and the use to be made
of  the  data  gathered  in  the  survey.  The  confi- 
dentiality provisions  of  pending  data  systems  will 
be  made  known  to  the  public  by  notices  that  will 
be  inserted  in  the  Federal  Register  annually  or 
more  frequently  if  changes  are  made  in  the  con- 
fidentiality provisions. 

3)  Every  individual  has  a  right  to  know  what  infor- 
mation is  contained  in  his  data  records  in  the 
Federal  agency  and  to  request  corrections  in  his 
record  in  these  data  systems. 

4)  There  are  limitations  on  the  transfer  of  data  in 
record  systems  from  one  Federal  agency  to  another 
for  statistical  purposes.  A  release  from  each  indi- 
vidual is  required  unless  either  of  these  conditions 
is  satisfied: 

a)  A  notice  in  the  Federal  Register  states  that  re- 
lease of  the  information  to  specified  agencies 
for  specific  purposes  is  intended.  Transfers  to 
other agencies are possible if the purpose is
compatible with the purpose for which
the data were originally collected.

b)  Transfer  of  information  to  the  Census  Bureau 
will  be  permitted  without  prior  notification  to 
the  respondent  if  the  information  being  re- 
leased is  related  to  the  Census  program. 

Discussion  by  participants  centered  on  the  provi- 
sions of  the  Act  and  their  interpretation.  It  was 
pointed  out  that  some  provisions  were  subject  to 
various  interpretations  and  that  it  was  not  clear  that 
the  Act  applied  to  all  Federal  agency  data  systems 
financed  under  contract  with  private  and  non-Federal 
agencies.  It  is  not  clear  how  the  Act  applies  to  data 
collected  in  the  past.  The  view  was  expressed  that 
the  Act  is  a  threat  to  statistical  collection  and  com- 
pilation operations  and  that  it  is  doubtful  it  would 
permit  repeating  some  important  statistical  studies 
involving  transfer  of  information  from  one  agency  to 
another  such  as  the  Birth  Registration  Test  and  the 
University  of  Chicago  National  Study  of  Social  Class 
Differentials  in  Mortality.  Also,  there  is  the  danger 
that  the  Privacy  Act  will  serve  as  model  legislation  for 
state and local governments. It was proposed that professional
groups, acting collectively rather than independently,
exert influence to modify the legislation.
An  avenue  for  such  action  might  be  through  the 
National  Research  Council,  which  has  a  committee 
(chaired  by  Dr.  Alice  Rivlin)  that  is  investigating 
the  negative  consequences  of  overly  restrictive  legisla- 
tion on  linking  data  files.  Furthermore,  the  Commit- 
tee on  National  Statistics,  National  Academy  of 
Sciences,  has  an  arrangement  with  the  Bureau  of  the 
Census  to  convene  a  panel  to  investigate  the  effects 
of  the  Privacy  Act. 

This  problem  is  not  peculiar  to  the  United  States. 
Several  Western  European  countries  have  in  recent 
years  adopted  legislation  to  protect  the  confidentiality 
and  privacy  of  individual  records.  In  Sweden  this  has 
restricted the scope of statistical studies. Dalenius
(1974) has presented an overview of the invasion of
privacy problem and discussed statistical techniques for
overcoming this problem.

The  effect  of  the  Act  with  regard  to  Social  Security 
numbers  was  mentioned.  The  opinion  was  expressed 
that  the  Act  did  not  ban  the  use  of  the  Social  Security 
number  on  non-SSA  records  but  makes  the  reporting 
of  the  number  a  voluntary  matter  for  the  respondent. 
Any  universal  identifier  as  a  requirement  is  banned 
by  the  Act. 

The  discussion  on  confidentiality  and  invasion 
of privacy may be summarized as follows:

1)  Recent  legislation  has  generated  a  conflict  of  in- 
terest between  the  need  to  protect  the  confiden- 
tiality of  individual  records  and  the  need  to  pro- 
duce essential  statistics  required  for  social  and 
economic  program  planning  and  evaluation.  This 
conflict  of  interest  presents  a  real  but  unnecessary 
threat to those producing essential statistics.

2)  This  conflict  of  interest  is  not  an  inherent  feature 
of  data  systems.  As  a  matter  of  fact,  traditionally, 
the  major  statistical  Federal  agencies  have  been 
the  most  outspoken  proponents  of  and  contribu- 
tors to  the  policy  and  practice  of  assuring  and 
providing  confidentiality  of  information  about  in- 
dividuals. 

3)  Recent  legislation  has  contributed  to  the  problem 
by  its  lack  of  specificity  and  clarity.  Thus,  the 
legislation  is  open  to  various  interpretations  and 
it is vague about matters of its implementation.
Its  primary  weakness,  however,  is  that  it  fails  to 
make  clear  a  distinction  between  statistical  rec- 
ords, administrative  records  used  for  programmatic 
and  regulatory  purposes,  and  administrative  rec- 
ords used  for  research  purposes. 

4)  Several  conference  attendees  proposed  that  pro- 
fessional societies  represented  by  statisticians,  so- 
cial scientists,  and  related  groups,  should  seek  to 
amend  the  Privacy  Act  and  influence  the  imple- 
mentation of  its  provisions  so  that  a  clear  and 
viable  distinction  is  made  between  record  systems 
used  for  statistical  purposes  and  those  used  for 
program  and  regulatory  purposes  while,  at  the 
same  time,  preserving  features  of  the  Act  that 
strengthen  the  confidentiality  of  records  of  indi- 
viduals. 

Record  Linkage 

The discussion of record linkage (Waksberg) began
by stressing the technical difficulty of conducting
linkage studies, especially from the viewpoint of
matching  records  for  the  same  individuals  in  different 
data  sets.  Alternative  objectives  for  conducting  link- 
age studies  were  described  as  follows: 
1)  To  evaluate  statistics  that  are  generated  by  a  data 
system.  For  example,  the  Census  Bureau  has  con- 
ducted coverage  studies  involving  linkage  of  birth 
records and Census records, and income studies involving
IRS and Census records. Establishing positive
matches between records in different systems is
always  a  difficult  problem,  and  there  are  errors 
due  to  mismatches  and  nonmatches.  Frequently,  it 
is  difficult  to  decide  which  of  the  matched  records 
is  the  more  valid  one. 

2)  To  supplement  the  statistics  obtained  from  a  data 
system.  In  the  Medical  Economic  Study  being  con- 
ducted by  Johns  Hopkins  University  and  Westat 
under  contract  with  the  National  Center  for 
Health  Statistics,  the  household  expenditures  on 
medical  care  being  collected  from  a  panel  of  con- 
sumers in  a  household  sample  survey  are  being 
supplemented  by  data  obtained  from  the  records 
of  medical  sources  providing  the  care  and  the  rec- 
ords of  health  insurance  companies  making  the 
third  party  payments. 

3)  To  obtain  outcome  statistics  to  evaluate  a  non- 
statistical  program.  For  example,  the  effect  of  Em- 
ployment Training  Acts  is  being  evaluated  on  the 
basis  of  the  future  earnings  of  trainees  as  reflected 
in  Social  Security  records  of  the  trainees. 
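
The matching step that underlies all three types of linkage study can be sketched as follows. This is a minimal illustration and not a description of any procedure used by the Census Bureau or the other studies mentioned. The record fields, the choice of name and birth date as the match key, and the sample data are all assumptions.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def match_key(record: Record) -> Tuple[str, str]:
    """Build a crude match key from name and birth date (an assumed choice of fields)."""
    return (record["name"].strip().lower(), record["birth_date"])


def link_records(
    survey: List[Record], registry: List[Record]
) -> Tuple[List[Tuple[Record, Record]], List[Record]]:
    """Split survey records into matched pairs and nonmatches using exact key equality."""
    registry_by_key = {match_key(r): r for r in registry}
    matched: List[Tuple[Record, Record]] = []
    unmatched: List[Record] = []
    for rec in survey:
        other = registry_by_key.get(match_key(rec))
        if other is None:
            unmatched.append(rec)
        else:
            matched.append((rec, other))
    return matched, unmatched


# Hypothetical records: the second person appears in both files under different name forms.
survey = [
    {"name": "Ann Lee ", "birth_date": "1950-03-01"},
    {"name": "Bob Ray", "birth_date": "1948-07-12"},
]
registry = [
    {"name": "ann lee", "birth_date": "1950-03-01"},
    {"name": "Robert Ray", "birth_date": "1948-07-12"},
]

matched, unmatched = link_records(survey, registry)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

In this sketch the second person is the same individual in both files but is counted as a nonmatch because the name is recorded differently, which is the kind of matching error the discussion warns about.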

The  Chairman  recommended  that  the  group  con- 
centrate on  the  first  type  of  study  and  address  the 
question,  "How  do  you  decide  which  record  is  cor- 
rect?" 

It  was  observed  that  there  is  no  universal  rule. 
In  many  studies,  however,  the  answer  is  reasonably 
clear.  For  instance,  utility  costs  based  on  the  records 
of  utility  companies  are  probably  better  than  those 
based  on  responses  to  a  household  survey.  Similarly, 
physician  costs  based  on  doctor  records  are  probably 
more  valid  than  those  reported  in  household  surveys. 
There  are  instances,  however,  in  which  one  cannot 
assume  that  records  of  physicians  are  more  valid  than 
responses  in  household  surveys.  For  example,  for  preg- 
nancies involving  a  fixed  cost  for  a  specified  regimen 
of  care,  the  records  of  the  obstetrician  may  fail  to 
list  all  visits  since  the  patient  is  not  charged  sepa- 
rately for  each  of  them. 

Several  participants  introduced  their  own  exper- 
iences in  trying  to  determine  validity  of  records.  In 
the pretest of teen-age drug use (Haberman), alternate
labeling methods were investigated. More
drug  users  were  enumerated  when  the  persons  pro- 
vided their  names,  which  was  the  least  anonymous 
of  the  labeling  methods.  Records  being  used  for 
validation  purposes  may  be  plagued  by  the  same 
problems  as  the  statistics  being  validated.  For  ex- 
ample, the  D.C.  Drivers  Test  (Boisen)  attempted  to 
use  drivers'  licenses  to  estimate  undercoverage  of 
black  males  in  critical  age  groups  in  the  Census.  How- 
ever, the  addresses  on  the  drivers'  licenses  were  sub- 
ject to  gross  inaccuracies. 

There  is  reason  to  be  suspicious  about  equating 
an  increase  in  frequency  of  reporting  with  greater 
validity.  For  example,  studies  of  college  groups 
(Boruch)  indicate  that  there  is  overreporting  of 
marijuana  use  and  driving  while  drinking  and  other 
types of behaviors approved by peer groups. Although
the guidelines are not entirely clear, the Alcohol and
Drug  Abuse  Acts  of  1970  and  1971  empowered  the 
HEW  Secretary  to  grant  testimonial  privilege  to  social 
researchers  working  on  these  topics. 

There  are  ways  of  checking  physician  records  to 
determine  accuracy  in  reporting.  One  method  is  in- 
ternal consistency.  For  example,  for  persons  with 
allergies,  the  standard  practice  (Sudman)  is  to  require 
an  allergy  shot  once  every  week  or  two.  If  the  pa- 
tient reports  on  a  regular  basis  for  three  months  for 
a  weekly  shot  and  the  physician's  record  indicates 
sporadic  visits  by  the  patient,  one  suspects  that  it  is 
the  physician  who  is  in  error.  A  study  was  done  in 
Saskatchewan  (Fuchsberg)  comparing  household  in- 
terview survey  data  with  physician  records.  It  was  dis- 
covered that  15  percent  of  physicians'  claims  were 
not  filed  and,  hence,  never  appeared  in  the  record 
system. 

The  use  being  made  of  records  can  sometimes  be 
a guide to the records' validity. For example, it is
socially  desirable  for  a  physician  to  specify  his  teach- 
ing hospital  affiliation  when  reporting  information  for 
AMA  records,  even  though  he  may  have  no  such 
affiliation (Monteiro).

Sometimes underreporting may be due to matching
problems. In one study (Cannell), about 15 percent
of the admissions were not reported by hospitals.
By  repeatedly  returning  the  unmatched  names  to  the 
hospital,  the  underreporting  was  reduced  from  15 
percent  to  2  percent. 

There  is  a  growing  concern  about  the  effect  the 
invasion  of  privacy  and  record  linkage  problems  will 
have  on  response  rates.  For  example,  in  one  study 
(Woolsey)  hospital  records  are  being  used  to  estimate 
the  incidence  of  fairly  rare  diseases.  The  plan  is  to 
supplement  the  information  in  the  hospital  records 
by  conducting  surveys  with  the  patients  and  their 
families.  Thus,  the  names  and  addresses  of  patients 
as  recorded  in  the  hospital  files  are  needed.  Some 
hospital  authorities  have  concluded  that  family  au- 
thorization is  needed  before  their  hospitals  can  par- 
ticipate in  the  study.  One  solution  would  be  to  have 
the  hospitals  serve  as  the  agent  for  collecting  the  in- 
formation from  the  families;  another  solution  would 
be  to  have  the  hospitals  request  authorization  from 
the  families.  (Neither  solution  seems  to  be  ideal  be- 
cause of  inherent  problems  in  both.)  A  legal  type 
authorization  form  versus  an  informal  letter  approach 
is  being  considered. 

In  concluding  the  topic  of  record  linkage,  the 
point  was  made  that  perhaps  the  group  was  being 
overly  pessimistic  about  the  use  of  validation  studies. 
For  example,  the  factors  associated  with  disease  eti- 
ology are  rarely  determined  by  a  single  epidemiologi- 
cal study.  Similarly,  it  may  take  a  combination  of 
several  validation  studies  before  conclusions  can  be 
reached  about  bias  errors.  In  its  validation  studies, 
the  Census  Bureau  depended  on  several  studies  to 
estimate  response  errors. 


Sensitive  Questions 

The  Chairman  mentioned  that  at  least  five  tech- 
niques have  been  found  useful  in  gathering  data  on 
sensitive  questions,  protecting  data  confidentiality  dur- 
ing transmission  over  telephone  lines  and/or  while  in 
storage  on  computers  and  in  data  banks,  or  in  restrict- 
ing the  interpretation  of  published  data  so  that  con- 
fidentiality is  retained.  These  methods  are  by  no 
means  a  complete  listing  because  they  omit  such 
obvious  techniques  as  anonymous  replies,  sampling 
of  variates,  use  of  interval  measurements,  and  other 
procedures. 

Randomized  Response 

The  first  technique  discussed  was  randomized  re- 
sponse, a  technique  developed  only  ten  years  ago  by 
Stanley  Warner  (Warner:  1965) .  The  term  random- 
ized response  is  a  misnomer  because  it  is  really  a  re- 
sponse to  a  randomized  question.  To  illustrate  use 
of  the  technique  in  one  of  its  simplest  forms,  the 
Chairman  demonstrated  its  application.  His  objec- 
tive was  to  ascertain  what  proportion  of  the  confer- 
ence participants  had  cheated  on  their  Federal  income 
tax  in  the  year  1974.  For  purposes  of  definition, 
cheating  was  defined  as  the  underreporting  of  in- 
come, such  as  dividends,  interest,  honoraria,  or  con- 
sultant fees  by  $25  or  more,  and/or  the  overstatement 
of  deductible  items,  such  as  medical  expenses,  business 
expenses,  or  charitable  contributions  by  a  like  amount. 

The  participants  were  asked  to  take  a  coin  from 
their  pocket  and  proceed  as  follows: 

"Toss  the  coin  in  the  air  and  if  the  result  is 
'heads', keep that fact to yourself but answer the sensitive
question 'Yes' or 'No'. The sensitive question is
whether  or  not  you  cheated  on  your  income  tax  last 
year.  If  the  result  of  the  coin  toss  is  'tails',  answer  the 
nonsensitive  question  in  the  same  way.  The  nonsensi- 
tive  question  is  whether  your  mother  was  born  in 
the  month  of  April.  (If  you  do  not  know  your 
mother's  exact  month  of  birth,  substitute  your  own 
month  of  birth  but  keep  the  fact  to  yourself.)" 

The  Chairman  then  wrote  on  the  blackboard 
the  same  instructions. 

HEADS  I  cheated  on  my  Federal  income  tax 
last  year. 
(Yes  or  No) 

TAILS   My  mother  was  born  in  the  month  of 
April. 

(Yes  or  No) 

All  those  who  wished  to  reply  "Yes"  raised  their 
right  hand,  and  fourteen  were  counted.  Those  who 
wished  to  reply  "no"  then  raised  their  hands  and 
thirty-six  were  counted. 

14  Yes 
36  No 

50  Total  =  N 


The Chairman explained that if there had been
no cheating on income taxes last year, the only persons
who would have raised their hands would have
been those who had a coin turn up "tails" and whose
mother was born in April. The expected number of
such persons is approximately (1/2 x 1/12) of 50, or
slightly over two. Thus, there were about 14 - 2 = 12 persons
who admitted to cheating as defined. To convert this
number to a percentage, note that only one-half of the 50
persons would be expected to have a coin turn up
"heads" and thus to answer the sensitive
question. Hence, the proportion of cheaters is
calculated as follows:

$$\frac{14 - 2}{\tfrac{1}{2}(50)} = \frac{12}{25} = 48\%$$
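
The arithmetic of the demonstration can be written as a small estimator. The following is a minimal sketch and not part of the original proceedings: the function name and its parameters (`p_sensitive` for the chance of drawing the sensitive question, `pi_y` for the frequency of the nonsensitive attribute) are illustrative assumptions, and 1/12 for an April birth is the same approximation used in the demonstration.

```python
def unrelated_question_estimate(yes_count, n, p_sensitive, pi_y):
    """Estimate the proportion with the sensitive attribute.

    Each respondent answers the sensitive question with probability
    p_sensitive and the nonsensitive question otherwise, so the expected
    share of "Yes" replies is
        lam = p_sensitive * pi_a + (1 - p_sensitive) * pi_y.
    Solving for pi_a gives the estimate returned here.
    """
    lam = yes_count / n
    return (lam - (1 - p_sensitive) * pi_y) / p_sensitive


# Conference demonstration: 14 "Yes" out of 50, fair coin, April birth ~ 1/12.
estimate = unrelated_question_estimate(14, 50, 0.5, 1 / 12)
print(round(estimate, 3))
```

Because the sketch does not round the expected number of "tails and April" replies down to two, it gives about 47.7 percent rather than 48 percent.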

Several  questions  were  raised  by  the  conferees 
immediately  following  this  demonstration.  Would 
the  respondent  believe  that  randomized  response  did 
not  violate  his  privacy?  Would  he  not  be  equally  as 
willing  to  reply  anonymously  on  a  piece  of  paper 
placed  in  a  sealed  envelope?  Is  there  any  comparison 
of  cooperation  between  these  two  methods?  Does 
tossing  a  coin  create  more  suspicion  among  respon- 
dents than  other  randomizing  devices?  Has  there  been 
a  validation  of  responses? 

The  Chairman  responded  to  some  of  the  queries 
before  inviting  general  discussion  from  the  floor.  He 
said  he  knew  of  no  studies  comparing  the  randomized 
response  versus  the  anonymous  reply  placed  in  an 
envelope  but  expressed  a  personal  preference  for  the 
coin  toss  because  of  a  possibility  that  an  interviewer 
might  open  the  envelope  and  record  the  respondent's 
name  after  leaving  the  interview.  With  respect  to 
validation,  he  mentioned  a  randomized  response  sur- 
vey on  annual  income  conducted  in  North  Carolina. 
The  results  for  both  black  and  white  families  were 
within  a  few  dollars  of  the  averages  published  by  the 
Bureau  of  Labor  Statistics  for  the  southeastern  part 
of  the  United  States. 

The  role  of  education  of  respondents  and  the 
possible  effect  of  education  on  respondent  coopera- 
tion was  raised.  The  (Brown  and  Harding:  1973) 
study  of  drug  use  among  officers  and  enlisted  men  was 
reviewed  (Horvitz) .  In  all  cases  of  drug  usage  except 
marijuana,  the  reported  use  was  greater  by  random- 
ized response  than  by  anonymous  questionnaire.  The 
increase  in  reported  use  was  greater  among  officers 
and  this  may  be  either  because  they  understood  the 
method  better  or  they  felt  more  threatened  by  the 
possibility  of  apprehension  through  the  anonymous 
questionnaire. There is no question that the more
threatening the respondent perceives the question to be,
the more value attaches to the procedure of
randomized response.

In  a  study  of  induced  abortion  in  Taiwan,  Chow 
and  others  (I-Cheng,  Chow  and  Rider:  1972)  used 
randomized  response  among  the  general  population 
in that country. In fact, they used complicated sampling
devices consisting of a volumetric flask containing
colored  balls  and  a  cloth  bag  containing  colored 
stones.  The  results  were  similar  to  all  the  other  studies 
of  induced  abortion  in  such  populations.  The  rates 
estimated  by  randomized  response  were  high  and,  as 
indicated,  in  line  with  what  one  would  expect  to 
have  occurred  if  the  truth  could  be  ascertained. 

A report was made (Sudman) on studies
suggesting that in special situations randomized response
reduces underreporting, although it does not
eliminate bias. Samples were selected from Court
records  of  people  who  had  declared  bankruptcy,  and 
from  persons  arrested  for  drunken  driving,  and  also 
from  the  general  population.  Comparisons  of  under- 
reporting were  made  between  randomized  response, 
face-to-face  interviews,  telephone  interviews,  and  self- 
administered  questionnaire.  In  the  case  of  drunken 
driving,  randomized  response  was  best  although  there 
was  still  some  underreporting.  On  bankruptcy,  there 
was  zero  underreporting  using  randomized  responses. 
But,  when  dealing  with  socially  desirable  attributes 
such  as  voting,  randomized  responses  did  not  appear 
to  work  at  all.  In  summary,  not  all  response  errors  are 
eliminated  by  randomized  response.  Moreover,  when- 
ever data  are  to  be  classified  by  other  variables,  such 
as  age,  race,  or  sex,  the  sacrifices  are  even  greater 
because  the  sample  sizes  are  reduced. 

Questions  were  raised  as  to  whether  or  not  the 
respondent  might  feel  he  was  giving  away  his  privacy 
by  having  to  report  "Yes"  and  whether  there  was  an 
ideal  nonsensitive  question.  In  reply,  the  point  was 
made  that  the  nonsensitive  question  undoubtedly  af- 
fects the  respondent  cooperation  and  influences  the 
variability of the estimate. If the frequency of the
nonsensitive attribute is symbolized by $\pi_Y$, the respondent
who is a member of the sensitive group is given the maximum
protection when $\pi_Y = 1$. Thus, in the demonstration
concerning cheating on income taxes, if the coin
toss was tails, the respondent could have been instructed
simply to reply "Yes." In that way, at least
50 percent of the replies would have been in the
affirmative and the respondent would have the maximum
protection were he in the sensitive class. Of
course, a "No" response would mean the respondent
is answering the question on cheating and privacy is
thereby lost.

If $\pi_Y = 1$, protection for the sensitive group member
is greatest but the variance is also large. Conversely,
if $\pi_Y = 0$, it is almost the same as if the direct
question had been used because a "Yes" reply indicates
membership in the sensitive class. Of course, in
the latter case the sampling variance would be at a
minimum. The Chairman felt that it is desirable to
try to select $\pi_Y$ at approximately the same level of
frequency as estimated for the sensitive question. That
choice provides adequate protection from respondent
suspicion and is close to the minimum variance.
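
The trade-off between protection and variance can be illustrated numerically. This is a minimal sketch under stated assumptions: replies are truthful, the frequency of the nonsensitive attribute is known, and the function name and the input values (20 percent in the sensitive class, a fair coin, 1,000 respondents) are hypothetical choices made only for illustration.

```python
def rr_variance(pi_a, pi_y, p_sensitive, n):
    """Sampling variance of the unrelated-question estimate of pi_a,
    assuming truthful replies and a known pi_y."""
    lam = p_sensitive * pi_a + (1 - p_sensitive) * pi_y  # P("Yes")
    return lam * (1 - lam) / (n * p_sensitive ** 2)


# Hypothetical values: 20 percent in the sensitive class, fair coin, n = 1,000.
for pi_y in (0.0, 0.2, 1.0):
    print(pi_y, rr_variance(0.2, pi_y, 0.5, 1000))
```

For these inputs the variance is smallest when the nonsensitive frequency is 0, larger when it equals the sensitive frequency, and largest when it is 1. The ordering depends on the inputs and need not hold when the sensitive proportion is above one-half.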

More research is needed on this
subject, however, as well as on respondent perception
and how it relates to his cooperation and willingness
to tell the truth. Some persons might perceive a difference
between a sampling device that uses the sensitive
question 50 times out of 100 and one with a
probability of one-half from a coin toss.

It  was  pointed  out  that  there  was  a  need  for  a 
rough  idea  of  the  magnitude  of  the  bias  in  a  direct 
question  approach  to  decide  whether  the  increased 
sampling  variance  was  worth  the  sacrifice.  This  issue 
was  discussed  in  some  of  the  first  few  articles  on 
randomized  response  published  in  the  Journal  of  the 
American  Statistical  Association.  It  was  shown  there 
that  one  need  not  have  more  than  5  to  10  percent 
untruthful  reporting  of  a  binomial  variable  to  more 
than  compensate  for  sacrifice  resulting  from  the  ran- 
domized response  technique.  The  opinion  was  ex- 
pressed that  bias  will  vary  with  the  socioeconomic 
status  of  the  individual  and  how  threatening  the 
question  is  to  him.  More  research  is  needed  to  com- 
pare the  different  methods  so  as  to  measure  the  bias 
with  sufficiently  large  groups  of  respondents  of  dif- 
fering backgrounds  and  with  a  wide  variety  of  sensi- 
tive questions. 
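
The comparison between the bias of a direct question and the added variance of randomized response can be sketched in terms of mean square error. The response-error model below (a fixed fraction of persons in the sensitive class falsely answering "No" to a direct question, and fully truthful replies under randomized response) is an assumption made for illustration, as are the function names and input values; it is not taken from the articles mentioned above.

```python
def direct_question_mse(pi_a, lie_rate, n):
    """MSE of a direct question when a fraction lie_rate of those in the
    sensitive class falsely answer "No" (an assumed response-error model)."""
    expected_yes = pi_a * (1 - lie_rate)
    bias = expected_yes - pi_a
    variance = expected_yes * (1 - expected_yes) / n
    return variance + bias ** 2


def randomized_response_mse(pi_a, pi_y, p_sensitive, n):
    """MSE of the unrelated-question estimate, assuming fully truthful replies
    (so the estimate is unbiased and MSE equals the sampling variance)."""
    lam = p_sensitive * pi_a + (1 - p_sensitive) * pi_y
    return lam * (1 - lam) / (n * p_sensitive ** 2)


# Hypothetical inputs: pi_a = 0.2, 10 percent untruthful, n = 2,000, fair coin.
print(direct_question_mse(0.2, 0.10, 2000))
print(randomized_response_mse(0.2, 0.2, 0.5, 2000))
```

With these inputs the randomized response estimate has the smaller mean square error. With no untruthful reporting the direct question is better, which is the trade-off the discussion describes.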

Reference was made to a study in which a validity
comparison confirmed the randomized response technique
at an early stage of its development. This was a study
of illegitimacy in North Carolina in which sample
households, obtained from birth records on file with
the state health department, were visited and randomized
response was used to determine whether or not
an illegitimate birth had occurred. The proportion of
illegitimate births in the sample was known in advance
and was used for comparison with the randomized
response estimate. In white households the true
value was 7.7 percent illegitimate births, whereas
the randomized response estimate was 7.4 percent.
Black households were purposely selected to
yield an illegitimacy rate of around 50 percent. The
true value among the birth samples was 45.4 percent,
and through randomized response 42.3 percent was
estimated. This latter estimate was based on a sample
size of less than 100.

The question was raised whether
a direct question might have done just as well when
dealing with illegitimate births. In response to this,
another  study  was  cited  in  which  a  direct  question 
was  asked  about  births  occurring  in  households.  Only 
households  in  which  a  birth  had  actually  occurred 
were  included  in  the  sample.  Among  illegitimate 
births,  over  50  percent  of  the  respondents  reported 
no  birth  had  occurred  in  that  household.  This,  of 
course,  is  a  very  large  bias. 

The  point  was  made  that  in  almost  all  surveys 
reporting  confidential  data  the  degree  of  bias  is  not 
shown.  If  one  reports  the  response  rate  at  85  percent, 
we do not know if this is high or low. It was also
observed (Dalenius) that one advantage of randomized
response is that if a court were to subpoena
survey records, nothing could be used against a particular
respondent. This is another reason randomized
response is so valuable; moreover, developments in
the technique since 1965 have been tremendous. Finally,
there is no reason randomized response cannot
be used in conjunction with other methods, including
direct response.

In  summary,  the  consensus  seemed  to  be  that  ran- 
domized response  has  a  lot  to  offer  in  those  special 
situations  where  the  respondent  may  feel  threatened 
with  an  invasion  of  his  privacy.  More  research  and 
applications  need  to  be  undertaken  on  the  use  of 
sampling  devices,  new  designs,  use  of  innocuous  ques- 
tions, use  in  mail  surveys,  and  the  role  of  the  inter- 
viewer. The  interviewer  has  a  more  important  in- 
fluence in  randomized  response  than  with  structured 
interview  schedules  because  he  or  she  must  not  only 
be  convinced  of  the  value  of  the  method  but  be  pre- 
pared to  answer  questions  to  allay  any  respondent 
suspicions. 

Coding  Designs 

Coding designs are methods useful to collect data
as well as to protect their confidentiality during transmission
over telephone lines or storage in computers.
Since data are stored in binary sets, the coding
procedure uses a series of 0's and 1's. The same sequence
is used to decode as to code the data if binary sets are
used as in a computer.
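
One way to read the remark that the same sequence both codes and decodes binary data is bit-by-bit addition modulo 2. The proceedings do not name the operation, so the sketch below rests on that assumption; the function names, the sample record, and the seed are hypothetical, and Python's `random` module is used only for illustration and is not suitable for real protection of confidential data.

```python
import random


def make_key(length, seed):
    """Generate a reproducible sequence of 0's and 1's from a secret seed."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(length)]


def apply_key(bits, key):
    """Add the key to the data bit by bit, modulo 2 (exclusive or).
    Applying the same key a second time restores the original bits."""
    return [b ^ k for b, k in zip(bits, key)]


record = [1, 0, 1, 1, 0, 0, 1, 0]          # hypothetical stored record
key = make_key(len(record), seed=1975)     # the seed stands in for the secret
coded = apply_key(record, key)
decoded = apply_key(coded, key)
print(decoded == record)
```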

The coding designs can be combined with randomized
response so that the sequence of 0's and 1's
may be random as long as the program for generating
them is kept secret. This influences the calculation of
correlation  coefficients  between  sets  of  data.  It  was 
mentioned  that  about  1971  there  was  a  Ph.D.  thesis 
by  William  Barksdale  at  the  University  of  North 
Carolina  that  discussed  this  problem  of  correlated  data 
in  randomized  response. 

These coding techniques are useful where many
persons have access to data in the computer. The point
was made that a cryptic device is useful to protect
confidential data in-house. Also, in publishing data,
sometimes data in a cell may be subjected to randomized
response as long as the marginal totals are retained.
(Reference was made to a report in the Office
of Education on this subject.)

It  was  also  pointed  out  that  sometimes  a  face 
sheet  with  identifying  data  can  be  stored  separate 
from  the  data  sheets  as  long  as  there  is  a  linkage  file. 
Attending  Census  Bureau  representatives  confirmed 
their  desire  to  protect  the  privacy  of  data  transmitted 
over  telephone  lines. 

Weighing  Designs 

Weighing designs are survey techniques that can
also be used to collect sensitive data. The original
work on weighing designs done by Hotelling many
years ago concerned weighing small objects on a balance scale.
With this method, instead of measuring separately
the weight of objects X and Y in turn, with standard error σ
for each weighing, one can obtain as much information
in two weighings as from four. The ingenious
device is to weigh X + Y, and then X - Y. The weight
of X is one-half the sum of these two results, and the
weight of Y is one-half the difference of the two results.
The result is that each estimate has the same
standard error as if it were based on the mean of
two direct weighings (Wallis and Roberts: 1956,
Banerjee: 1975).
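
The claim about the standard error can be checked in a few lines. This sketch assumes that the two weighings have independent errors with mean zero and a common variance $\sigma^2$; the symbols $W_1$, $W_2$, $e_1$, and $e_2$ are introduced here for illustration and do not appear in the proceedings.

```latex
\begin{aligned}
W_1 &= X + Y + e_1, \qquad W_2 = X - Y + e_2,\\
\hat{X} &= \tfrac{1}{2}(W_1 + W_2) = X + \tfrac{1}{2}(e_1 + e_2),\\
\hat{Y} &= \tfrac{1}{2}(W_1 - W_2) = Y + \tfrac{1}{2}(e_1 - e_2),\\
\operatorname{Var}(\hat{X}) &= \operatorname{Var}(\hat{Y})
  = \tfrac{1}{4}\left(\sigma^2 + \sigma^2\right) = \tfrac{\sigma^2}{2}.
\end{aligned}
```

A variance of $\sigma^2/2$ is what the mean of two direct weighings of a single object would give, so two combined weighings carry as much information as four separate ones.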

This  concept  is  easily  transferred  to  collecting 
sensitive  data  where  Y  may  be  a  threatening  question 
and X not. Thus, suppose one-half of the sample
respondents are asked:

"How many times did you go to the movies during
the last month? plus How many abortions
have you had during the past year?" The other
half of the sample respondents are asked: "How
many times did you go to the movies during the
last month? minus How many abortions have you
had during the past year?"
Obviously,  these  are  only  illustrations  of  the  tech- 
nique and  one  has  to  use  care  to  choose  X  so  that  it 
is  always  greater  than  Y  to  avoid  negative  numbers. 
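
The estimation step for this two-question design can be sketched as follows. The function name and the reported totals are hypothetical and serve only to show the arithmetic: the mean of the sensitive variate is one-half the difference between the mean reported sum and the mean reported difference.

```python
from statistics import mean


def estimate_means(sum_reports, diff_reports):
    """Estimate the mean of X (innocuous) and Y (sensitive).

    sum_reports:  X + Y totals from one random half of the sample
    diff_reports: X - Y totals from the other half
    """
    s, d = mean(sum_reports), mean(diff_reports)
    return (s + d) / 2, (s - d) / 2


# Hypothetical reported totals, for illustration only.
sums = [3, 2, 5, 4, 2]    # movies plus sensitive count
diffs = [2, 1, 4, 3, 2]   # movies minus sensitive count
mean_x, mean_y = estimate_means(sums, diffs)
print(mean_x, mean_y)
```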

There  are  many  variations  on  this  technique  that 
can  be  combined  with  randomized  responses,  such  as 
Federer  and  his  colleagues  at  Cornell  have  done.  They 
used  balanced  incomplete  block  designs.  Thus,  they 
were  interested  in  obtaining  estimates  of  seven  vari- 
ates  so  that  each  group  of  seven  respondents  was  asked 
to  report  the  totals  of  three  questions,  as  follows: 

Y1 = X1 + X2 + X4
Y2 = X2 + X3 + X5
Y3 = X3 + X4 + X6
Y4 = X4 + X5 + X7
Y5 = X5 + X6 + X1
Y6 = X6 + X7 + X2
Y7 = X7 + X1 + X3

That  is,  the  first  respondent  in  each  group  of  seven 
was asked to report the sum of variates X1, X2, and
X4.  The  second  respondent  was  asked  to  report  the 
sum  of  X2,  X3,  and  X5,  and  so  on.  The  interviewer 
does  not  know  which  three  questions  the  respondent 
has  added  for  his  answer  since  the  latter  drew  one  of 
the  seven  possibilities  at  random.  The  estimating 
equations  are  straightforward  (Smith,  Federer,  and 
Raghavarao:  1974,  Raghavarao  and  Federer:  1973). 
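
The estimating equations for this design can be sketched directly. The block list below follows the pattern described in the text (first respondent X1, X2, X4; second X2, X3, X5; and so on cyclically). The sketch assumes that the analyst can group the reported totals by block and average them, a step the proceedings do not describe; the function name and the check values are hypothetical.

```python
# Blocks of the design: respondent type i reports the total of the
# three variates whose indices appear in BLOCKS[i].
BLOCKS = [(1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
          (5, 6, 1), (6, 7, 2), (7, 1, 3)]


def estimate_variate_means(block_means):
    """Recover the seven variate means from the seven mean block totals.

    Every variate appears in three blocks and every pair of variates
    shares exactly one block, so the sum of the block means containing
    variate j equals 2 * mu_j + (mu_1 + ... + mu_7), while the sum of
    all seven block means equals 3 * (mu_1 + ... + mu_7).
    """
    grand_total = sum(block_means) / 3          # mu_1 + ... + mu_7
    estimates = []
    for j in range(1, 8):
        containing = sum(m for blk, m in zip(BLOCKS, block_means) if j in blk)
        estimates.append((containing - grand_total) / 2)
    return estimates


# Check with made-up means 1..7: build the block totals, then solve back.
true_means = [1, 2, 3, 4, 5, 6, 7]
block_means = [sum(true_means[j - 1] for j in blk) for blk in BLOCKS]
print(estimate_variate_means(block_means))
```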

One  participant  (Waksberg)  observed  that  what 
worried  him  in  estimating  the  Y  question  in  the  sim- 
ple design  with  (X  +  Y)  is  that  the  variance  of  the 
nonsensitive  X  question  may  be  so  much  greater  and 
one  is  not  really  reducing  the  variance  of  Y.  One 
should  pick  an  X  with  a  low  variance  in  the  popula- 
tion. 

Contamination  or  Error  Inoculation  Methods 

The  Chairman  expressed  regret  that  Dr.  Boruch 
had to leave the conference early, because the latter
has contributed many ideas and applications to this
area  in  which  contamination  procedures  are  purposely 
introduced  to  mask  the  true  value  of  an  observation. 
The  method  can  be  used  to  inject  error  by  the  re- 
spondent in  his  reply  so  as  to  protect  its  confiden- 
tiality. The  contamination  might  also  be  used  in 
data  storage  in  computer  files  or  whenever  confiden- 
tial data  are  published  for  small  cells  or  areas  that 
might  be  easily  identified. 

Dr.  Horvitz  commented  that  in  one  version  of 
this  technique  the  interviewer  directs  the  respondent 
to  use  some  randomization  choice  in  order  to  de- 
termine whether  to  lie  or  tell  the  truth  when  replying 
to  a  sensitive  question.  One  random  choice  might  be 
simply  to  lie  when  answering  a  sensitive  question, 
whereas  the  other  random  choice  would  be  to  tell  the 
truth.  What  happens  is  that  false  negatives  and  false 
positives  occur,  and  one  has  to  correct  for  them  in 
estimating the true proportion. Thus,

$$\pi = \frac{p - \alpha}{1 - \alpha - \beta}$$

where $\alpha$ = false positive rate
      $\beta$ = false negative rate
      $p$ = reported proportion
      $\pi$ = true proportion
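
The correction for deliberately introduced false positives and false negatives can be sketched as a short function. The function name and the check values are hypothetical; the relation used is that the reported proportion equals the true proportion times one minus the false negative rate, plus the remainder times the false positive rate.

```python
def corrected_proportion(p_reported, alpha, beta):
    """Correct a reported proportion for deliberately injected error.

    alpha: false positive rate, beta: false negative rate.
    Since p = pi * (1 - beta) + (1 - pi) * alpha, solving for pi gives
    pi = (p - alpha) / (1 - alpha - beta).
    """
    denom = 1 - alpha - beta
    if denom == 0:
        raise ValueError("alpha + beta = 1 leaves the true proportion unidentifiable")
    return (p_reported - alpha) / denom


# Hypothetical check: true proportion 0.30 with alpha = beta = 0.20.
p = 0.30 * (1 - 0.20) + (1 - 0.30) * 0.20
print(round(corrected_proportion(p, 0.20, 0.20), 6))
```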

It  was  observed  that  this  is  somewhat  similar  to 
what  Dr.  Kenneth  Poole  reported  in  his  recent  article 
(Poole:  1974) .  Poole  was  interested  in  the  distribu- 
tion function  of  a  continuous  variable  and  used  in- 
come distribution  as  an  illustration.  He  combined  the 
contamination  with  randomized  response  by  asking 
the  respondent  to  multiply  the  true  response  by  a 
random  number  and  to  tell  the  interviewer  only  the 
final result. A similar technique involves adding or
subtracting a random number, with mean zero, to or from
the true response and reporting only the contaminated
algebraic sum. A question was asked whether this is
not  a  weighing  design  and,  if  not,  what  the  dif- 
ference is  between  a  weighing  design  and  contamina- 
tion. 

The  Chairman  stated  there  was  a  structural  simi- 
larity but  that  a  weighing  design  involves  the  report- 
ing of  sums  of  several  components,  not  individually 
identified,  without  any  inoculation  of  error.  In  the 
contamination  procedure,  one  adds  a  contaminant  at 
random.  For  example,  suppose  the  respondent's  in- 
come is  $20,000.  The  respondent  is  asked  to  choose 
a  random  number  between  1  and  5  from  some  de- 
vice. If  he  selects  3,  the  reported  answer  is  $60,000 
and  the  interviewer  would  not  know  whether  the  true 
value  is  $12,000,  $15,000,  $20,000,  $30,000  or  $60,000. 
The  contaminant  is  selected  at  random. 
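
The income example can be sketched in code. The first function lists the true values consistent with a reported product, which is all the interviewer can infer. The second shows how a mean might still be estimated across many respondents, under the added assumption (not stated in the proceedings) that each multiplier from 1 to 5 is equally likely and chosen independently of income. Function names are hypothetical.

```python
def candidate_true_values(reported, multipliers=range(1, 6)):
    """All true values consistent with a reported product, given that the
    respondent multiplied by one of the listed whole numbers."""
    return sorted(reported / m for m in multipliers)


def estimate_mean(reported_values, multipliers=range(1, 6)):
    """Estimate the mean true value, assuming each multiplier is equally
    likely and chosen independently of the true value."""
    mults = list(multipliers)
    expected_multiplier = sum(mults) / len(mults)
    return sum(reported_values) / len(reported_values) / expected_multiplier


print(candidate_true_values(60000))
```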

As  noted  earlier,  Dr.  Boruch  was  not  able  to 
be  present  to  emphasize  the  value  of  this  procedure  or 
to  amplify  its  applications  in  certain  instances  of  sensi- 
tive data.  Boruch  has  compared  the  method  to  ran- 
domized response  both  in  a  theoretical  sense  and  in 
actual  field  trials.  The  design  can  also  be  made  more 
complicated in ways other than by simply inoculating
false positive and false negative errors. Others
have examined some of these designs, and two handy
references are the papers by Warner (Warner: 1971)
and Greenberg, et al. (Greenberg, Horvitz and Abernathy:
1974). Before leaving, Dr. Boruch gave the
Chairman a copy of his latest effort in this area. The
report involves the use of the technique to preserve
data file confidentiality (Campbell, Boruch, Schwartz,
and Steinberg: 1974).

Network  Surveys  of  Rare  and  Sensitive  Conditions 

Sirken  discussed  network  surveys  as  a  method  of 
dealing  with  sensitive  questions  by  protecting  con- 
fidentiality. 

The health and related conditions that
respondents find sensitive or threatening when
asked about them in population sample surveys are
often rare conditions. Thus, survey estimates of these
conditions  are  not  only  subject  to  substantial  under- 
reporting but  to  large  sampling  errors  as  well.  Various 
design  strategies  have  been  proposed  (Sirken:  1970) 
for  estimating  rare  health  conditions,  but  few  strat- 
egies have  been  proposed  for  estimating  conditions 
that  are  both  rare  and  sensitive.  In  these  remarks, 
Sirken  described  briefly  (1)  the  network  survey 
methods  for  controlling  both  the  sampling  error  and 
response  bias  and  (2)  an  interesting  application  of 
this  method  to  a  household  sample  survey  of  sub- 
stance use  that  was  recently  conducted  for  the  Michi- 
gan Office  of  Drug  Abuse  (Sirken:  1975) . 

The  essential  design  feature  of  the  network 
survey  of  substance  use  is  that  the  drug  user  is  per- 
mitted to  be  enumerated  at  more  than  one  enumera- 
tion unit.  To  adjust  for  the  contingency  that  not  all 
drug  users  are  eligible  to  be  enumerated  the  same 
number  of  times,  network  estimators  require  ancillary 
information  that  is  not  needed  by  the  estimators  of 
conventional  surveys  since  the  latter  would  not  permit 
the  same  drug  users  to  be  enumerated  more  than 
once.  Several  unbiased  network  estimators  have  been 
reported  (Birnbaum  and  Sirken:  1968,  Hsieh:  1970)  . 
One  of  these,  the  multiplicity  estimator,  weights  every 
enumerated  drug  user  by  the  inverse  of  the  number 
of  enumeration  units  where  the  user  is  eligible  to  be 
enumerated.  The  ancillary  information  needed  to 
calculate  the  counting  rule  weight  is  usually  collected 
from  the  person  who  reports  the  drug  user  in  the 
survey.  For  example,  if  the  household  survey  adopted 
a  counting  rule  that  made  drug  users  eligible  to  be 
reported  by  their  friends,  the  person  in  the  survey 
who  reported  a  friend  as  a  drug  user  would  also  re- 
port the  number  of  the  user's  friends. 
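
The multiplicity estimator can be sketched as follows. The sketch assumes an equal-probability sample with a known sampling fraction and a "friends" counting rule; the function name and the example figures are hypothetical.

```python
def multiplicity_estimate(reported_friend_counts, sampling_fraction):
    """Estimate the number of drug users in the population.

    reported_friend_counts: one entry per report of a drug user, giving the
        number of friends that user has (the number of persons eligible
        to report him or her under a "friends" counting rule).
    sampling_fraction: share of the population included in the sample.
    """
    if any(count <= 0 for count in reported_friend_counts):
        raise ValueError("every reported user must have at least one friend")
    weighted_count = sum(1 / count for count in reported_friend_counts)
    return weighted_count / sampling_fraction


# Hypothetical survey: three reports of users who have 2, 4, and 4 friends,
# from a sample covering 1 percent of the population.
print(multiplicity_estimate([2, 4, 4], 0.01))
```

Weighting each report by the inverse of the user's number of friends keeps a user with many friends from being counted more heavily than a user with few.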

The  Michigan  Survey  of  Substance  Use  estimated 
the  prevalence  of  substance  use  during  the  preceding 
year for alcohol and 15 different kinds of drugs. Two
sets  of  estimates  were  produced.  One  set,  referred 
to  as  the  conventional  or  self-estimates,  was  based  on 
questions  in  which  the  sample  persons  reported  their 
own  use.  The  other  set,  referred  to  as  network  or 
friends estimates, was based on projective questions
to which the sample persons reported the percentages
of their friends who used each substance. The
friends estimator of substance use was the average
of the percentages of friends reported as users by sample
persons in the survey. The friends estimates were
between 50 and 200 percent higher than the self-estimates
for each of the 10 nonprescribed and illicit
drugs but somewhat smaller than the self-estimates
for alcohol and for 4 of the 5 prescribed drugs. The
sampling variances of the friends estimates were uniformly
(25 to 50 percent) smaller than the sampling
variances of the self-estimates.

The friends estimates are puzzling. Why are they
larger than the self-estimates of nonprescribed and
illicit drugs? In this connection, one can note that
the questions about friends' use preserve the anonymity
of the drug users since their identities are not divulged
by the friends who report them in the survey. Hence,
the questions on nonprescribed and illicit drug use
by friends might be less threatening and therefore less
subject to underreporting than the questions on
self-use of these drugs. However, this does not explain
why the friends estimates appear to be about the
right order of magnitude. A possible explanation was
offered along the following lines.

The  friends  estimator,  being  a  network  estimator, 
would  be  unbiased  if  every  user  enumerated  in  the 
survey  were  weighted  by  the  inverse  of  the  number 
of  times  he  was  eligible  to  be  enumerated.  In  this 
case,  the  number  of  times  a  user  is  eligible  to  be 
enumerated  is  equal  to  the  number  of  the  user's 
friends.  In  the  Michigan  Survey,  however,  the  estima- 
tor weighted  the  enumerated  users  by  the  inverse  of 
the  number  of  the  respondent's  friends.  Thus,  a  suffi- 
cient condition  for  the  estimator  based  on  friends  use 
in  the  Michigan  Survey  to  be  unbiased  is  that  the 
friends  of  drug  users  each  have  about  the  same  num- 
ber of  friends  and  this  number  is  equal  to  the  number 
of  friends  of  the  drug  user.  The  condition  would  be 
satisfied,  if,  for  example,  the  friends  of  a  drug  user 
were  friends  of  each  other  and  none  of  them  had 
any other friends. The fact that the friends estimator
has smaller sampling variance does not necessarily imply
that it is superior to the self-estimator because (1)
the estimates, based on both estimators, are subject to
measurement errors that would arise in conducting
the surveys, and (2) for fixed sample size the survey
costs are greater for the friends estimator than for
the self-estimator. Selected experiments need to be conducted
to estimate the mean square error and cost
components associated with the two estimators to determine
the conditions under which one or the other
estimator is indicated. Since the preliminary findings
from the Michigan Survey are intriguing,
they deserve to be investigated, replicated, and, it is
hoped, improved.
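
The sufficient condition for the friends estimator to be unbiased can be illustrated with two small, entirely hypothetical friendship networks. For simplicity the sketch questions every person rather than a sample, and the function name and network layouts are assumptions made for illustration.

```python
from fractions import Fraction


def friends_estimate(friends, users):
    """Average, over every person asked, of the share of that person's
    friends who are users (a census of respondents, for simplicity)."""
    shares = [Fraction(sum(f in users for f in fs), len(fs))
              for fs in friends.values()]
    return sum(shares) / len(shares)


# Two closed circles of friends: everyone in a circle has the same number
# of friends, so the condition described above holds.
cliques = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f", "g"], "e": ["d", "f", "g"],
    "f": ["d", "e", "g"], "g": ["d", "e", "f"],
}
print(friends_estimate(cliques, {"a", "d", "e"}))   # true prevalence is 3/7

# One user with three friends who know nobody else: the condition fails.
star = {"h": ["x", "y", "z"], "x": ["h"], "y": ["h"], "z": ["h"]}
print(friends_estimate(star, {"h"}))                # true prevalence is 1/4
```

In the first network the estimate equals the true prevalence. In the second, the user's friends each have fewer friends than the user does, and the estimate overstates use.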

Dr. Eckerman suggested that it was possible that
the friends estimator overstates drug use.
This may simply be a function of lack of knowledge
and misapprehension regarding usage by others.
Schanck (1932), in an early study of norms in a rural
community, coined the term "pluralistic ignorance"
to account for the fact that while many household
respondents actually deviated from church-instilled
norms prohibiting card playing and use of alcohol and
tobacco, they at the same time contended their neighbors
and friends adhered to these norms. We may be
encountering  a  similar  phenomenon  in  the  drug  abuse 
field  but  with  people  overestimating  rather  than  un- 
derestimating their  friends'  drug  usage.  He  suggested 
that  research  was  needed  to  investigate  this  matter. 
Sirken  agreed,  but  added  that  it  is  generally  believed 
that  population  surveys  underestimate  the  prevalence 
of  non-prescribed  and  illicit  drug  use. 

Record  Checks 

The  Chairman  asked  two  or  three  participants 
to  discuss  the  question  of  how  to  use  record  checks  to 
establish  validity. 

Mr.  Shapiro  noted  that  the  Health  Services  Re- 
search and  Development  Center  of  The  Johns  Hop- 
kins University,  in  collaboration  with  Westat,  Inc., 
is  testing  alternative  survey  methods  for  collecting 
information  on  medical  utilization  and  expenditures 
under  a  contract  with  the  National  Center  for  Health 
Statistics.  A  household  panel  is  being  requested  to 
maintain  diaries  on  health  care  experience  over  a  six 
month period. Two experimental variables are being
tested: periodicity of reinterview (monthly vs. bimonthly)
and type of contact (in-person vs. telephone).
The cost-effectiveness of the alternative strategies
is being measured using several criteria, including
accuracy and completeness of household data
determined through comparisons with data records
of health care providers and third party payers.

Dr.  Federspiel  discussed  the  Medicaid  Program  in 
Tennessee.  The  primary  objectives  of  the  project  are 
to  ascertain  validity  and  to  get  some  idea  of  the  extent 
of improper prescribing of drugs. The data files are
being studied. The file of medical service claims, which
contains the diagnoses for services provided, has been
matched with the file of prescription claims, which
identifies the purchased drug. The matched records have
disclosed inappropriate prescribing of some drugs in
the validation process.

Mr.  Jabine  reported  that  record  check  studies 
(Steinberg: 1973; Scheuren, Bridges, and Kilss: 1973;
Scheuren, Kilss, and Oh: 1973; Learner: 1974; Robbins
and Siegmund: 1974; Dyer: 1974), involving intergovernmental
agency data linkage (the Census Bureau,
Internal Revenue Service, and the Social Security
Administration), were conducted to improve the
quality  of  statistics  on  income  distribution.  The  studies 
have  linked  income  data  reported  in  the  Current 
Population  Survey  with  earnings  and  benefit  data 
reported in Social Security records. The Pilot Link
Study was conducted in 1963, and the Exact Match
Study  was  conducted  in  1973.  The  Social  Security 
number  was  one  of  the  variables  used  to  match  rec- 
ords in  the  two  data  systems.  Another  record  check 
study  currently  underway  involves  linking  of  reports 
of  Social  Security  income  payments  in  the  March 
1975  Current  Population  Survey  with  reports  of 
earnings  records  in  the  files  of  the  Social  Security 
Administration. 

It  was  further  observed  that  a  selected  bibliog- 
raphy has  been  completed  on  the  matching  of  person 
records  from  different  sources  (Garey  and  Hwang: 
1974) . 
Summary and Conclusions

Persons  collecting  data  in  surveys  should  always 
be  on  guard  to  ascertain  the  validity  of  the  responses. 
Bias  may  occur  not  only  because  the  information  in- 
volves a  sensitive  area  of  questioning  but  also  because 
the  respondent  may  not  know  or  remember  the  true 
facts. 

Sensitive  questions  can  be  ordinally  scaled  into 
four  categories  as  follows: 

1.  Illegal  actions  or  behavior  (e.g.  cheating  on 
income  tax,  driving  under  the  influence  of  al- 
cohol, speeding) . 

2.  Not  illegal  but  socially  deviant  behavior  (e.g. 
specific  unusual  sex  practices,  alcoholism  or 
other  drinking  patterns  frowned  upon  by  so- 
ciety). 

3.  Embarrassing  facts  (e.g.  bankruptcy,  failure 
in  school,  dishonorable  discharge) . 

4.  Confidential  data  wherein  privacy  is  sought 
(e.g.  earnings,  voting  behavior,  history  of  ill- 
ness) . 

In  attempting  to  validate  data,  one  technique  is 
to  use  other  available  records.  In  such  cases,  one  must 
be  careful  not  to  violate  the  provisions  of  the  Privacy 
Act  of  1974  especially  since  it  is  becoming  possible  to 
link  more  records  by  means  of  computers  and  the  use 
of  universally  used  identifying  numbers  such  as  found 
on  Social  Security  cards  and  drivers'  licenses. 

It  was  agreed  that  some  provisions  of  the  Act 
threaten  to  stop  many  of  the  kinds  of  legitimate  re- 
search and  validation  procedures  statisticians  have 
used for decades. There was consensus that statisticians
and survey researchers should attempt, through
their professional organizations, to have objectionable
portions of this legislation modified to permit research
that cannot harm an individual either directly
or  as  a  member  of  a  group.  Specifically,  the  point  was 
made  that  a  distinction  should  be  drawn  between  re- 
cord systems  used  for  statistical  purposes  (e.g.  vital 
statistics,  census)  and  those  used  for  program  and  reg- 
ulatory purposes.  Suggestions  were  made  to  channel 
these  efforts  through  the  National  Research  Council 
and  the  Committee  on  National  Statistics  of  the  Na- 
tional Academy  of  Sciences. 

Record  linkage  as  a  means  to  check  validity  was 
discussed  as  were  some  of  the  alternative  uses  of  link- 
ing records  from  various  systems.  Concentrating  on 
the  usage  of  record  linkage  for  the  purposes  of  con- 
firming validity,  it  was  pointed  out  that  frequently 
one  is  faced  with  having  to  decide  which  record  is  the 
correct  one.  One  should  not  assume  that  the  record 
with  the  larger  number  of  undesirable  attributes,  or 
greater  frequency  of  asocial  behavior,  is  automatically 
or  always  the  correct  one.  Also,  linkage  of  records  has 
another  problem  in  the  number  of  mismatches  and 
nonmatches. It was the consensus that matching records
should be continued in many studies and a
larger number or a combination of sources should be
sought.
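
The linkage issues summarized above (matches, nonmatches, and disagreement between sources) can be sketched in a few lines. This is a minimal illustration, assuming exact matching on a single identifier; the field names `ssn` and `income`, and both function names, are hypothetical and are not taken from any of the studies cited.

```python
def link_records(survey, admin, key="ssn"):
    """Exact-match linkage of survey records to administrative records.

    Returns three lists: matched pairs, survey records with no
    administrative counterpart, and survey records whose key matches more
    than one administrative record (potential mismatches).
    """
    admin_by_key = {}
    for rec in admin:
        admin_by_key.setdefault(rec[key], []).append(rec)

    matched, nonmatched, ambiguous = [], [], []
    for rec in survey:
        candidates = admin_by_key.get(rec.get(key), [])
        if len(candidates) == 1:
            matched.append((rec, candidates[0]))
        elif not candidates:
            nonmatched.append(rec)
        else:
            ambiguous.append(rec)
    return matched, nonmatched, ambiguous


def net_difference(matched, field):
    """Mean of (survey value - record value) over matched pairs.

    Neither source is treated as the truth here; the result only shows
    how far the two sources disagree on average.
    """
    diffs = [s[field] - a[field] for s, a in matched]
    return sum(diffs) / len(diffs) if diffs else None
```

The sketch deliberately reports a difference rather than an "error", in line with the caution that one should not assume either record is automatically the correct one.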

In  the  area  of  sensitive  questions,  there  was  dis- 
cussion of  five  procedures  found  to  be  useful  to  en- 
courage respondents  to  cooperate  and  to  reply  more 
truthfully.  The  first  technique  discussed  at  length  was 
the  randomized  response  in  which  the  question  itself 
is  selected  by  the  respondent,  and  the  data  gatherer 
does  not  know  from  the  answer  which  question  is 
chosen.  The  usefulness  of  this  method  was  considered 
in  various  settings  and  it  was  generally  agreed  that 
more  field  trials  should  be  conducted  to  ascertain  the 
method's  applicability  in  the  four  previously  listed 
categories  of  sensitive  questions.  Especially  needed  are 
studies  of  the  different  factors,  such  as  age,  education, 
and  economic  status  that  influence  the  respondent's 
comprehension  and  cooperation  in  randomized  re- 
sponse. In  addition  further  studies  are  needed  on  the 
sampling  device  used  to  select  the  question  although 
preference  seemed  to  fall  on  the  toss  of  a  coin.  The 
interviewer's  influence  in  the  use  of  randomized  re- 
sponse and  its  applicability  to  questionnaires  in  mail 
surveys  were  also  found  to  need  further  exploration. 
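
As a hedged illustration of how a randomized response estimate is recovered, the sketch below assumes Warner's two-statement design, in which a random device with known probability p points the respondent to the sensitive statement and otherwise to its complement. The session did not single out this variant, and the function names and parameters are hypothetical. Note that this particular design cannot use a fair coin (p = 0.5), since the prevalence is then not identifiable; a coin toss fits variants that pair the sensitive question with an unrelated one.

```python
import random


def warner_estimate(yes_fraction, p):
    """Estimate the prevalence pi of a sensitive trait under Warner's design.

    P(yes) = p * pi + (1 - p) * (1 - pi), so
    pi = (P(yes) - (1 - p)) / (2p - 1), provided p != 0.5.
    """
    if abs(p - 0.5) < 1e-12:
        raise ValueError("p must differ from 0.5; pi is not identifiable otherwise")
    return (yes_fraction - (1.0 - p)) / (2.0 * p - 1.0)


def simulate_yes_fraction(true_pi, p, n, seed=0):
    """Simulate n truthful respondents and return the fraction answering yes.

    The interviewer sees only yes/no, never which statement was drawn.
    """
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        has_trait = rng.random() < true_pi
        got_sensitive_statement = rng.random() < p
        answer_yes = has_trait if got_sensitive_statement else not has_trait
        if answer_yes:
            yes += 1
    return yes / n
```

For example, `warner_estimate(simulate_yes_fraction(0.2, 0.7, 20000), 0.7)` should land close to the assumed prevalence of 0.2, at the price of a larger variance than a direct question would give if everyone answered it truthfully.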

Two  other  methods  used  to  gather  data  on  sensi- 
tive issues  were  the  coding  designs  and  weighing  de- 
signs. The  former  is  interesting  because  it  is  also  easily 
adaptable  to  the  protection  of  the  confidentiality  of 
data  stored  in  computers  or  transmitted  over  tele- 
phone lines.  The  weighing  designs  are  not  a  new  con- 
cept and  have  a  structural  similarity  to  balanced  in- 
complete blocks  in  experimental  design. 

The  contamination  or  error  inoculation  method 
was  also  discussed  because  it  too  can  be  adapted  to 
protect  the  confidentiality  of  stored  or  transmitted 
data.  The  method  has  interesting  possibilities,  particu- 
larly with  educational  and  psychological  measures, 
and  further  research  should  be  conducted  to  study 
different  variations  of  the  technique  as  well  as  field 
trials  of  its  usefulness. 

The  fifth  and  last  technique  to  collect  data  on 
sensitive  issues  that  was  discussed  is  the  network  sur- 
vey involving  multiple  respondents.  It  has  a  certain 
resemblance to the sociograms and sociometric analysis
used to study each respondent's relationship and/or
rating  to  others  in  his  or  her  network.  The  consensus 
was  that  this  is  another  area  needing  further  explora- 
tion to  ascertain  not  only  how  to  improve  the  tech- 
nique of  estimation  but  also  field  trials  to  learn  where 
the  procedure  may  be  useful. 

Needed  Research 

1.  There  is  increasing  evidence  that  several  new 
techniques  in  survey  research  can  reduce  and  per- 
haps eliminate  the  bias  caused  by  untruthful  re- 
porting or  the  refusal  to  answer  questions  about 
sensitive issues. One of the most potentially fruitful
is randomized response, and further research
is needed to conduct field studies to establish
how well the method overcomes bias in the
four categories of sensitive questions enumerated
in  the  Summary  and  Conclusions.  Particular  at- 
tention should  be  focused  on  the  interviewer's  in- 
fluence as  well  as  the  effect  of  age,  and  socio-eco- 
nomic status  of  the  respondent. 

2. The validity of data gathered by surveys and special
studies should always be examined by checking
various records to obtain information from
other  sources.  Research  is  needed  so  that  survey 
users  can  learn  about  new  techniques  of  linkage 
in  order  to  match  records  for  validity.  Moreover, 
additional  studies  should  attempt  to  illustrate 
how  some  of  the  infrequently  used  sources  of  offi- 
cial statistics  might  be  developed  for  establishing 
validity  of  response.  These  needs  are  particularly 
great  when  the  data  involve  sensitive  issues  such 
as  those  enumerated  in  the  four  categories  men- 
tioned in  the  Summary  and  Conclusions. 

3. The use of weighing designs, contamination of
data,  coding  systems,  and  network  surveys  using 
multiple  respondents  are  also  useful  techniques 
in  learning  about  sensitive  issues.  More  research 
is  needed  to  compare  the  efficiency  of  these  meth- 
ods vis-a-vis  the  direct  question  and  randomized 
response.  It  is  especially  important  to  ascertain 
which  procedure  is  optimal  under  specific  cir- 
cumstances. 

4. Further research in randomized response is
needed  to  establish  the  reaction  and  perception 
of  respondents  to  the  method  and  to  ascertain 
the  amount  of  risk  or  jeopardy  they  are  willing 
to  tolerate  before  refusing  to  cooperate  or  resort- 
ing to  untruthful  replies.  This  relationship  will 
vary  with  the  degree  of  sensitivity  as  scaled  in  the 
four  categories  listed  in  the  Summary  and  Con- 
clusions. 


5.  Additional  studies  should  be  made  of  the  accept- 
ability of  various  sampling  devices  used  in  ran- 
domized response  and  the  contamination  methods. 
These  include  decks  of  cards,  coins,  dice,  sealed 
transparent  plastic  boxes,  the  random  number 
target,  or  the  volumetric  spherical  flask  with 
colored  balls  invented  by  research  workers  at 
Johns  Hopkins  University. 

6.  More  studies  should  be  made  on  the  use  of  the 
randomized  response  involving  quantitative  var- 
iables. 

7.  Further  research  is  needed  on  the  usefulness  of 
randomized  response  and  contamination  meth- 
ods in  mail  questionnaires,  telephone  surveys, 
and  situations  other  than  the  personal  interview. 

8.  Studies  need  to  be  made  on  how  to  establish  the 
most  correct  record  when  multiple  record  checks 
are  instituted.  This  problem  is  especially  acute 
for  questions  that  may  involve  sensitive  items. 
Special examples should be developed to illustrate
how  to  use  longitudinal  studies,  additional  or 
supplemental  records,  specific  panels,  and  other 
respondents to ascertain validity. Also, what will
be the involvement of respondents themselves
in developing techniques to improve validity? This
need  for  validity  checks  is  especially  important 
when  overreporting  may  be  operative,  and  the 
fallible  assumption  is  sometimes  made  that  the 
source showing greater use or frequency is automatically
judged to be the correct one. Various
hypotheses  need  testing  according  to  the  cate- 
gories of  varying  sensitivity  as  enumerated  in 
the  Summary  and  Conclusions.  In  some  cate- 
gories, overreporting  may  be  more  serious  than 
underreporting  as  the  source  of  the  bias. 


TOTAL  SURVEY  DESIGN 

Daniel G. Horvitz, Ph.D., Chairman
Kirk  Wolter,  Ph.D.,  Rapporteur 


Introduction 

The  chairman  opened  the  session  by  defining  and 
then  discussing  "total  survey  design"  in  some  detail. 
Total  survey  design  (TSD)  is  a  concept  that  implies 
a  balanced  allocation  of  survey  resources  among  the 
different  error  components  in  order  to  minimize  the 
total  error  of  estimate.  For  example,  the  researcher 
who  invests  a  portion  of  a  given  survey  budget  so  as 
to  reduce  bias  arising  in  the  measurement  process 
rather  than  using  the  entire  budget  to  reduce  the 
sampling  error  by  increasing  the  sample  size,  is  at- 
tempting to  apply  the  total  survey  design  concept.  If 
the  particular  budget  allocation  results  in  the  smallest 
total  survey  error  achievable  for  the  given  survey  con- 
ditions and  budget,  then  the  survey  researcher  is  suc- 
cessfully applying  the  TSD  concept. 
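
The budget allocation idea can be made concrete with a toy calculation. The sketch below is illustrative only: it assumes that total error is measured as mean square error (sampling variance plus squared bias), that sample size is whatever the remaining budget buys, and that measurement bias declines exponentially with the amount spent on reducing it. The bias curve, the function names, and all parameter values are hypothetical.

```python
import math


def total_mse(x, budget, cost_per_interview, unit_variance, bias0, decay):
    """Mean square error when x budget units go to bias reduction.

    The rest of the budget buys interviews; bias0 * exp(-decay * x) is an
    assumed (hypothetical) curve for how spending reduces measurement bias.
    """
    n = (budget - x) / cost_per_interview
    sampling_variance = unit_variance / n
    bias = bias0 * math.exp(-decay * x)
    return sampling_variance + bias ** 2


def best_allocation(budget, cost_per_interview, unit_variance, bias0, decay, steps=1000):
    """Grid search for the bias-reduction spend that minimizes total MSE."""
    candidates = [budget * i / steps for i in range(steps)]  # always x < budget
    return min(
        candidates,
        key=lambda x: total_mse(x, budget, cost_per_interview, unit_variance, bias0, decay),
    )
```

With a budget of 100,000, a cost of 50 per interview, unit variance 0.25, an initial bias of 0.05, and decay 1e-4, the search moves part of the budget away from sample size; with no bias to remove it returns zero, which is the case where maximizing sample size is in fact the right design.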

To  use  the  TSD  concept,  an  error  model  is  re- 
quired that  can  be  applied  to  surveys  in  general.  Such 
a  model  must  be  able  to  include  all  of  the  different 
error  components  that  arise  in  surveys.  The  Bureau  of 
the  Census  model  developed  by  Hansen,  Hurwitz,  and 
Bershad  (1961) ,  and  referred  to  earlier  in  the  Con- 
ference, is  such  a  model.  It  includes  separate  com- 
ponents of  error  such  as  the  pure  sampling  error  vari- 
ance, the  simple  response  variance  (a  measure  of  re- 
sponse reliability  or  response  consistency),  the  corre- 
lated response  variance  (most  often  associated  with 
interviewers) ,  the  interaction  of  the  response  error 
components  with  the  sampling  error,1  and  the  bias  or 
net  systematic  error.  This  model  was  originally  devel- 
oped for  dichotomous  variables  and  simple  random 
sampling.  Koch  (1973)  has  recently  extended  this 
model  to  the  multivariate  case  and  for  continuous  as 
well  as  qualitative  variables.  The  immediate  value  of 
this  extension  is  twofold.  First,  it  includes  complex 
bivariate estimators, such as regression and correlation
coefficients and ratio estimation. Second, it is not
confined  to  simple  random  sampling,  but  may  be  ap- 
plied to  unequal  probability  sampling  designs. 
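
For orientation, the components listed above are often combined schematically as shown below. This is a simplified textbook-style statement written from general knowledge of the model, not a formula presented at the session. The notation is assumed here: $\sigma_S^2$ is the unit sampling variance, $\sigma_R^2$ the simple response variance, $\rho$ the correlation of response deviations within an interviewer's workload, $m$ the number of respondents per interviewer, and $B$ the net bias. Finite population corrections are ignored.

```latex
\[
\mathrm{MSE}(\bar{y})
  = \underbrace{\frac{\sigma_S^2}{n}}_{\text{sampling variance}}
  + \underbrace{\frac{\sigma_R^2}{n}\,\bigl[\,1 + \rho\,(m - 1)\,\bigr]}_{\text{simple and correlated response variance}}
  + \underbrace{2\,\mathrm{Cov}(\text{sampling},\ \text{response})}_{\text{interaction term}}
  + \underbrace{B^2}_{\text{squared bias}}
\]
```

The factor $1 + \rho\,(m - 1)$ shows why even a small interviewer correlation matters: with $\rho = 0.02$ and $m = 51$, the response variance term is doubled.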


1  For  example,  this  component  can  arise  when  those  respondents 
whose  exact  measure  of  the  variable  of  interest  is  less  than  average 
tend  to  underreport  their  exact  measure  and  those  whose  exact  meas- 
ure is  greater  than  average  tend  to  overreport  their  exact  measure. 


Total  survey  error  models  also  have  extremely 
important  long-range  significance.  They  provide  a 
basis  or  common  frame  of  reference  for  putting  into 
proper  perspective  methodological  research  concerned 
with  improving  the  quality  of  surveys.  Thus,  alter- 
native survey  procedures  (sample  designs  and  measure- 
ment designs)  can  be  partially  evaluated  by  compar- 
ing the  relative  magnitudes  of  the  different  compo- 
nents of  error  in  the  total  error  model.  As  indicated 
by  the  TSD  concept,  complete  evaluation  of  alter- 
native survey  procedures  requires  a  cost  function  and 
knowledge  of  the  cost  components,  as  well  as  know- 
ledge of  the  various  error  component  parameters  in 
the model. It follows that a total error model provides
a  basis  for  evaluating  or  adding  up  the  value  or  mean- 
ing of  all  the  survey  methodological  research  con- 
ducted to  date.  Such  a  summation  would  quite  likely 
reveal  significant  gaps  in  our  knowledge  of  survey 
errors.  Nevertheless,  the  error  and  cost  models  together 
with  those  estimates  of  model  parameters  that  are 
available  for  a  given  survey  strategy,  will  provide  the 
feedback  mechanism  so  essential  to  more  cost-effective 
choices  of  future  survey  strategies. 

Survey  designs  that  permit  the  total  mean  square 
error  to  be  estimated  require  a  method  for  estimating 
the  correlated  response  variance  component.  When 
data  are  subject  to  correlated  response  errors,  that 
component  is  not  included  in  the  usual  variance  esti- 
mates. Also,  a  separate  procedure  for  estimating  the 
net  bias  is  often  required,  although  certain  sources  of 
bias  (or  adjustments  for  bias)  can  be  measured  as 
part  of  the  regular  survey  design. 

Two additional references are Bailar (1975) and
Lessler  (1974).  The  first  of  these  provides  an  excellent 
discussion  of  the  various  error  components  in  the 
Bureau  of  the  Census  model  and  their  magnitudes 
for  selected  1970  census  variates.  The  second  reference 
provides  a  basis  for  making  rational  survey  design  de- 
cisions for  the  case  (mentioned  above)  in  which  some 
investment  is  made  to  eliminate  (reduce)  bias  by 
using  inexpensive  but  imperfect  measurements  on  all 
respondents  and  costly  but  accurate  measurements  on 
a  subsample  of  respondents. 
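
The design situation described here, a cheap but imperfect measurement on everyone and an accurate measurement on a subsample, can be illustrated with a simple difference estimator. This is a generic sketch of that idea and is not claimed to be the procedure developed in the cited reference; the function name and arguments are hypothetical.

```python
def bias_adjusted_mean(cheap_all, cheap_sub, accurate_sub):
    """Adjust the full-sample mean of a cheap measure for its estimated bias.

    cheap_all    -- cheap measurements on every respondent
    cheap_sub    -- cheap measurements on the subsample
    accurate_sub -- accurate measurements on the same subsample, same order
    """
    if not cheap_all or not cheap_sub or len(cheap_sub) != len(accurate_sub):
        raise ValueError("need a non-empty sample and a paired, non-empty subsample")
    mean_all = sum(cheap_all) / len(cheap_all)
    # Average amount by which the cheap measure differs from the accurate one.
    estimated_bias = sum(c - a for c, a in zip(cheap_sub, accurate_sub)) / len(cheap_sub)
    return mean_all - estimated_bias
```

The design question is then how large the costly subsample must be: too small and the bias estimate is noisy, too large and the budget left for the main sample shrinks.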


Questions  for  Discussion 

The  chairman  then  asked  the  session  participants 
for  discussion  of  the  following  question:  if  minimiza- 
tion of  the  total  error  of  estimate  for  a  given  survey 
resource  level  is  a  valid  survey  design  goal,  what  are 
the  implications  of  this  goal  for  the  design  and  con- 
duct of  methodological  studies? 

A  second  discussion  was  concerned  with  the  utility 
of  a  survey  error  parameter  (components)  comput- 
erized information  system;  that  is,  a  system  based  on 
a  total  survey  error  model  and  a  standardized  set  of 
error  component  definitions  acceptable  to  both  social 
scientists  and  statisticians.  Initially,  the  information 
system  would  contain  estimates  of  the  error  model 
parameters  for  survey  measurements  reported  in  the 
literature.  Once  established,  the  information  system 
would  be  available  to  the  survey  research  community 
in  general,  which  in  turn  would  contribute  new  data 
on  error  components  and  costs  from  future  surveys 
and  methodological  studies. 

Conceptually,  the  error  components  estimates 
would  be  stored  in  an  n-dimensional  matrix  with 
(n-1)  of  the  dimensions  providing  essential  descrip- 
tive information  of  the  specific  measurement  design, 
i.e.,  type  of  population  or  subpopulation,  context  of 
survey,  sample  design  (e.g.,  stratification,  size  of  sam- 
ple), variable  measured,  exact  wording  of  question, 
method of measurement (e.g., personal interview, mail
questionnaire, telephone interview), and relevant cost
data. The remaining dimension would contain the
specific error component parameters such as the sample
design effect, simple response variance, and bias.
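
One way to picture the proposed information system is as a keyed store: the descriptive dimensions form the key and the error component estimates form the value. The sketch below is only a modern illustration of that structure; every class, field, and function name is hypothetical, and the fields simply follow the dimensions listed in the paragraph above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MeasurementDesign:
    """Descriptive dimensions of one survey measurement (the n-1 axes)."""
    population: str
    survey_context: str
    sample_design: str
    variable: str
    question_wording: str
    method: str  # e.g. "personal interview", "mail", "telephone"


@dataclass
class ErrorComponents:
    """Error parameters for one design; None marks a cell not yet filled."""
    design_effect: Optional[float] = None
    simple_response_variance: Optional[float] = None
    bias: Optional[float] = None
    cost_per_case: Optional[float] = None


Store = Dict[MeasurementDesign, ErrorComponents]


def find(store: Store, **criteria: str) -> List[Tuple[MeasurementDesign, ErrorComponents]]:
    """Return entries whose descriptive fields equal all the given criteria."""
    return [
        (design, errors)
        for design, errors in store.items()
        if all(getattr(design, name) == value for name, value in criteria.items())
    ]


def empty_cells(store: Store) -> List[Tuple[MeasurementDesign, str]]:
    """List (design, component name) pairs for which no estimate exists yet."""
    return [
        (design, name)
        for design, errors in store.items()
        for name, value in vars(errors).items()
        if value is None
    ]
```

A query such as `find(store, method="telephone")` retrieves past experience with one mode of measurement, while `empty_cells(store)` points to the gaps that future methodological studies could fill.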

Total  Survey  Design  Discussion 

The  first  point  made  in  response  to  the  general 
question  (Jabine)  was  that  substantial  expenditures  of 
time,  money,  and  manpower  are  required  to  produce 
accurate  estimates  of  the  components  of  the  total 
mean  square  error.  This  was  illustrated  by  a  study 
conducted  at  the  Bureau  of  the  Census  (Jabine  and 
Tepping:  1973)  the  purpose  of  which  was  to  estimate 
each  of  the  components  of  the  total  mean  square  error 
for  certain  occupation  and  industry  items.  In  that 
study  accurate  estimates  of  all  the  components  were 
produced  except  for  the  bias  term.  To  estimate  the 
bias,  record  checks  were  performed.  However,  despite 
great  effort  and  substantial  expenditures,  the  result- 
ing standard  errors  of  the  bias  estimates  were  too  large 
to  admit  inferential  statements.  Consequently,  the 
authors  were  unable  to  evaluate  the  magnitude  of  the 
bias  of  the  variable  components  of  error. 

It  was  pointed  out  that  the  principal  uses  in- 
tended for  the  survey  data  must  be  considered  in  ap- 
plying the  TSD  concept  (Woolsey)  and,  in  fact, 
should  be  a  determining  factor  in  the  allocation  of 
funds to control the various error components. For
example, if one is estimating the change in unemployment
over a certain time period, the allocation may
differ  greatly  from  the  case  in  which  geographical 
comparisons  are  to  be  made  within  a  given  survey. 

The  chairman  agreed  with  this  comment,  adding 
that  the  discussion  of  the  acquiescent  respondent  the 
previous  day  illustrates  the  point  well.  Dr.  Carr  had 
suggested  a  measure  of  acquiescence  was  needed  for 
every  respondent  to  adjust  for  the  distortion  or  error 
it  introduced  into  data  on  anomie.  Since  the  principal 
purpose  of  a  survey  might  be  to  estimate  the  correla- 
tion between  anomie  and,  say,  social  status,  it  is  quite 
possible  that  a  TSD  approach  would  suggest  that  it 
would  be  more  cost-effective  to  measure  acquiescence 
on  only  a  subsample  of  the  respondents  and  then  use 
that  data  to  adjust  the  estimate  of  the  correlation  of 
interest. 

Many  trade-offs  are  involved  in  any  TSD  strategy. 
One  trade-off  discussed  (Fowler)  arises  when  consider- 
ing different  reporting  periods  for  which  data  are  to 
be  obtained.  For  example,  if  the  characteristic  of  in- 
terest is  the  number  of  recent  visits  to  a  physician,  the 
statistician may encounter considerably different nonsampling
errors depending upon how "recent" is defined:
whether it refers to the past week, the past
month,  or  the  past  year.  A  decision  to  use  the  past 
week  could  elicit  highly  accurate  data,  but  might  very 
well  require  a  considerably  larger  sample,  and  hence 
additional  cost,  to  achieve  a  sufficient  number  of  phy- 
sician visits  for  analysis  purposes.  From  a  cost  stand- 
point, it  may  be  better  to  ask  for  physician  visits  in 
the  past  year,  despite  the  increase  in  response  bias. 
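
The sample size side of this trade-off is easy to quantify. The sketch below is a rough illustration assuming a constant visit rate through the year and a single reporting rate that summarizes recall loss for a given reference period; the function name and all figures are hypothetical.

```python
import math


def respondents_needed(target_visits, visits_per_year, recall_weeks, reporting_rate):
    """Respondents required to collect a target number of reported visits.

    reporting_rate is the assumed share of true visits in the reference
    period that respondents actually report (1.0 means perfect recall).
    """
    expected_reported = visits_per_year * (recall_weeks / 52.0) * reporting_rate
    return math.ceil(target_visits / expected_reported)
```

With four visits per person per year, collecting 1,000 reported visits takes roughly 13,000 respondents for a one-week period with perfect recall, but only 313 for a one-year period with 80 percent reporting. The shorter period buys accuracy at about forty times the sample size.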

In  spite  of  the  noted  difficulty  in  applying  the 
total  survey  design  concept,  it  was  remarked  (Dale- 
nius)  that  it  provides  the  only  rational  approach 
available  to  the  statistician.  Other  approaches  only 
produce  very  special  results  at  best.  The  overuse  and 
misuse  of  the  word  "optimum"  is  illustrative  of  this. 
In  reality  there  is  no  theory  for  an  "optimum  sam- 
pling design."  There  are  merely  local  optima  that 
apply in very special cases, e.g., optimum allocation in
stratified  sampling. 

Another  important  point  stated  (Jabine)  at  the 
session was that the statistician or health services researcher
must first define the variables with which he
is  attempting  to  optimize  his  design  strategy.  This  is 
a  necessary  precondition  to  the  construction  of  a  total 
survey  design.  In  this  context,  the  possible  lack  of 
measures  of  the  components  of  error  appropriate  to 
surveys  concerned  with  estimating  changes  over  time 
was  noted. 

Finally, it was pointed out (Bradburn) that when
there  is  an  external  validity  criterion,  the  response 
error  may  include  a  bias  term,  i.e.,  a  deviation  from 
a  true  value.  When  there  is  no  such  criterion,  as  in 
the  case  of  attitudes,  only  response  variation  can  be 
measured. 


Total  Survey  Design  Matrix 

The  method  of  conceptualizing  the  TSD  informa- 
tion system  concept  suggested  by  the  chairman  was 
then  discussed.  This,  as  described  earlier,  would  in- 
volve a  large  matrix  whose  cells  would  contain  the 
cumulative  past  history  and  experience  of  survey  re- 
searchers. Such  a  matrix  would  identify  untapped 
areas  for  further  research  as  well  as  provide  guidance 
in  the  design  and  analysis  of  future  surveys. 

It  was  suggested  (Eckerman)  that  the  so-called 
design  matrix  was  a  useful  method  to  conceptualize 
total survey design and that a valuable first step would be
to  lay  out  the  dimensions  of  the  matrix.  It  was  urged 
that  this  be  accomplished  initially  through  a  system- 
atic inventory  of  what  is  known  about  the  methodo- 
logical issues. 

Berelson  and  Steiner's  book  entitled  "Human 
Behavior:  An  Inventory  of  Scientific  Findings,"  while 
perhaps  too  general  for  a  person  trained  in  social 
psychology,  is  no  doubt  very  useful  for  the  uninitiated 
as  a  means  of  obtaining  an  overview  of  the  field.  In 
a  similar  way,  a  recommendation  might  be  forthcom- 
ing from  this  conference  toward  the  assemblage  of 
such  an  inventory  in  the  field  of  survey  research  as  a 
means  of  systematizing  for  the  beginner  what  is 
known. 

It  is  apparent,  in  a  compendium  of  surveys  of 
drug  abuse  developed  by  William  A.  Glenn,  that  the 
level  of  sophistication  extant  among  members  of  this 
conference  is  unfortunately  infrequently  found  in 
newly  burgeoning  areas  of  research.  There  should  be 
some  ready  means  of  familiarizing  researchers,  new  to 
the  field,  with  the  intricacies  of  survey  research.  The 
Census  Bureau's  Technical  Report  No.  34  is  a  useful 
beginning  but  involves,  in  just  the  first  volume,  over 
25,000  entries.  An  inventory  might  be  a  means  of 
highlighting  the  most  important  and  relevant  findings 
of  the  past  few  years. 

Another  important  point  mentioned  (Jabine) 
with  regard  to  the  design  matrix  was  the  need  to 
standardize  terms.  It  was  felt  that  with  a  recognized 
standard  set  of  terms,  the  matrix  would  serve  as  an 
information  system  from  which  data  and  insight  into 
the  survey  design  problem  could  be  retrieved. 

Reference  was  made  to  the  recent  effort  to  stand- 
ardize background  items  used  in  survey  research 
(Eckerman) .  This  task  has  been  carried  out  under 
the  direction  of  the  Center  for  the  Coordination  of 
Research  on  Social  Indicators  and  a  report  is  avail- 
able (Reeder).  An  effort  designed  to  evaluate  com- 
monly used  instruments  and  scales  in  surveys  in  terms 
of  their  reliability  and  validity  apparently  ran  into 
many problems (Hensler); relatively few instruments
could  be  evaluated.  Nevertheless,  the  results  are  to  be 
published  this  summer  as  a  monograph  entitled 
Health Surveys Reference Index (Reeder).

In  discussing  the  TSD  matrix,  the  participants 
found it useful to distinguish between two fundamentally
different worlds of health services researchers
(Shapiro).

First,  there  are  those  researchers  who  are  in- 
volved with  repetitive,  large-scale  surveys.  Allied  with 
this  group  are  the  methodologists  whose  efforts  are 
directed  at  improving  survey  results  and  eliminating 
measurement  error.  It  is  perhaps  with  this  group  that 
the  prime  responsibility  rests  for  filling  out  the  cells 
of  the  design  matrix. 

The  second  group  of  researchers  is  those  who  are 
involved  with  "one-shot  surveys."  This  group  suffers  in 
that  monetary  resources  are  almost  never  available  to 
investigate  fully  or  estimate  each  of  the  components 
of  the  total  mean  square  error.  Moreover,  in  many 
cases  this  would  be  beyond  their  capabilities  even  if 
resources  were  available. 

A  discussion  of  the  dimensionality  of  the  TSD 
information  matrix  followed  along  with  its  implica- 
tions for  future  methodological  research  (Marquis) . 
The  principal  implication  of  this  discussion  was  that 
methodological  research  should  not  be  done  unless  the 
results  could  be  entered  into  the  information  matrix. 
Each  successive  study  should  add  to  the  store  of  knowl- 
edge by  filling  in  empty  cells.  It  was  recognized  that 
some  studies  would  not  necessarily  provide  data  along 
all n dimensions of the matrix, but that the findings
might still satisfy admissibility criteria although
providing only conditional results.

Funding  Problems 

One  of  the  participants  (de  la  Puente)  re- 
marked that  precious  little  money  is  allocated  for  the 
design  of  "one-shot"  surveys.  The  chairman  remarked 
that  this  is  one  of  the  prime  motivational  features  of 
the  total  survey  design  matrix.  It  is  exactly  because 
the  ad  hoc  survey  designer  does  not  have  money  to 
investigate  errors  that  may  arise  in  his  design  that  he 
needs  guidance  from  other  sources.  The  TSD  matrix 
would  provide  this  guidance  by  making  available  the 
cumulative  experience  of  others  working  with  similar 
designs  and  data. 

Several  participants  (Sirken,  Shapiro,  and  others) 
expressed  concern  that  no  one  organization  has  the 
prime  responsibility  for  funding  or  performing  origi- 
nal research  on  methods  of  evaluating  and  dealing 
with  nonsampling  errors.  To  cope  properly  with  the 
nonsampling  error  problem  requires  detailed  and  ex- 
pensive planning,  yet  very  little  research  money  has 
been  made  available  exclusively  for  this  purpose. 

Another  viewpoint  was  also  expressed  (Greenberg 
and  Waksberg) .  This  view  was  that  any  large  agency 
has  the  responsibility  to  study  survey  methodological 
problems.  Although  this  view  has  been  expressed  many 
times,  the  combined  experience  of  the  conference  par- 
ticipants was  that  no  strong  trends  in  this  direction 
are apparent within the federal statistical establishment,
except for such agencies as the Bureau of the
Census and the National Center for Health Statistics.

It  was  noted  (Cannell)  that  the  primary  concern 
of  many  health  services  researchers  and  most  funding 
agencies  is  with  the  items  of  costs,  sampling  errors, 
and  response  rates.  In  this  connection,  it  was  sug- 
gested that  if  valid  estimates  of  additional  compo- 
nents of  the  total  mean  square  error  were  available, 
these  would  provide  researchers  with  persuasive  argu- 
ments in  their  attempts  to  obtain  research  funds. 

A  vital  element  of  the  total  survey  design  matrix 
then  must  be  cost.  Survey  designers  need  to  know  the 
cost  trade-offs  involved  in  their  design  alternatives.  A 
design  matrix  would  provide  this  information  by  act- 
ing as  a  "shopping  list."  This  list  would  inform  the 
methodological  researcher  of  the  extent  of  the  bias, 
simple  response  variance,  correlated  response  vari- 
ance, and  sampling  variance  he  is  buying  with  a  par- 
ticular survey  procedure. 

The  comment  was  made  (Dalenius)  that  uniform 
definitions  of  costs  would  be  required  before  the  cost 
"shopping  list"  could  be  formulated.  Presently,  uni- 
form international  cost  definitions  do  not  exist. 

Hansen,  Hurwitz,  Bershad  Model 

As  discussed  in  the  introduction,  one  of  the  early 
attempts  to  provide  a  mathematical  model  for  non- 
sampling  errors  was  the  Hansen,  Hurwitz,  and  Bershad 
(1961)  model.  That  model,  amplified  by  Hansen, 
Hurwitz,  and  Pritzker  (1964) ,  has  been  used  exten- 
sively at  the  Bureau  of  the  Census. 

The  comment  was  made  (Woolsey)  that  even  in 
the  absence  of  a  total  survey  design  matrix,  the  mere 
existence  of  the  Hansen,  Hurwitz,  Bershad  model  has 
provided  survey  statisticians  with  valuable  insight  into 
the  design  process.  By  making  educated  guesses  of  the 
costs  and  the  magnitudes  of  error,  the  survey  designer 
has  been  able  to  make  sensible  choices  amongst  de- 
sign alternatives  by  reference  to  the  measurement 
error  model. 

In a related comment, it was noted (Sirken) that
although the Hansen, Hurwitz, Bershad model
identifies the parameters associated with measurement
error, the survey profession has not as yet provided
estimates of these parameters. This was illustrated by
noting that nonresponse percentages are frequently
quoted while the magnitude of the biasing effect of
nonresponse is rarely known (Schuman).

A contrary point of view was expressed: estimates
of some measurement error parameters have been
presented. The important question, then, is how much
the survey profession as a whole knows about
nonsampling error levels.

Presentation  of  Nonsampling  Errors 

Although much of the discussion centered on
problems of funding, it was also suggested (Waksberg)
that perhaps it is not yet known how reports
of nonsampling error levels should be presented for
specific surveys. This problem has been discussed
without resolution at the Census Bureau, where reports of
separate components of measurement error have been
published, but where a total mean square error figure
has yet to be produced.

Additionally,  it  was  noted  (Jabine)  that  the 
Census  Bureau  possesses  numerous  unpublished  esti- 
mates of  certain  components  of  the  mean  square 
error.  It  would  be  useful  to  health  services  researchers 
and  the  survey  profession  in  general  if  Census  pub- 
lished more  of  these  estimates  in  conjunction  with 
survey  results. 

With  regard  to  the  presentation  of  errors,  a  call 
(Dalenius)  for  more  honest  reporting  was  made.  Often 
survey  results  are  claimed  to  be  statistically  significant 
when  in  fact  the  estimated  standard  errors  are  so  large 
as  to  render  such  statements  false.  Moreover,  full  dis- 
cussions of  all  the  errors  of  estimate  are  rarely  pre- 
sented in  the  survey  literature.  By  failing  to  account 
for  all  error  sources,  the  survey  researcher  understates 
the  total  level  of  error  and  makes  inferential  state- 
ments based  on  the  understated  errors.  More  honest 
reporting  throughout  the  profession  would  be  helpful 
in  this  regard. 

Other  Comments 

It  was  noted  (Dalenius)  that  in  spite  of  funding 
difficulties  and  the  general  lack  of  methodological  re- 
search on  nonsampling  errors,  great  progress  has  oc- 
curred in  the  past  25  years.  Illustrative  of  this  is  the 
Birnbaum  and  Sirken  (1950)  paper  that  dealt  with 
the  optimum  size  of  nonresponse.  This  helped  re- 
searchers to  understand  the  trade-offs  inherent  in  a 
total  survey  design. 

Also  with  regard  to  the  nonresponse  problem,  it 
was  pointed  out  (Fowler)  that  perhaps  health  serv- 
ices researchers  should  make  a  more  concerted  effort 
to  obtain  information  about  nonrespondents  than  is 
currently  practiced.  It  was  felt  that  many  nonrespon- 
dents would  provide  information  related  to  the  prin- 
cipal objectives  of  the  survey  even  though  refusing  a 
formal  interview.  Such  information  could  be  quite 
valuable  in  assessing  the  bias  due  to  nonresponse. 

It  was  remarked  (Carpenter)  that  many  forms  of 
response  error  may  result  from  the  respondents'  lack 
of  perception.  In  some  studies  (Hamblin:  1971),  this 
lack  of  perception  has  been  shown  to  be  physiological. 
In  such  cases,  the  bias  may  be  predictable.  Once  meas- 
ured, this  perception  bias  could  be  adjusted  for  future 
surveys  in  which  similar  general  conditions  prevailed. 

Of course, it does not necessarily follow that the
magnitude of error observed in one survey will be
replicated in another survey. Nevertheless, it seems clear
that many kinds of error will repeat themselves, and
to deal with these sources it is necessary to build a
reserve of past information upon which to base future
decisions. The chairman observed that this is the
motivation for the construction of the total survey
design matrix.

Summary and Conclusions

Although  the  Total  Survey  Design  (TSD)  session 
generated  some  interesting  discussion,  it  seems  clear 
that  the  concept  needs  considerable  further  discussion 
in  the  literature  in  terms  readily  understood  by  sur- 
vey practitioners.  The  possible  implications  of  TSD 
to  the  design  and  conduct  of  methodological  research 
are  not  immediately  obvious  and  were  not  really  dis- 
cussed in  any  great  detail  in  the  TSD  session.  Thus, 
the  question  remains  unanswered  whether  it  is  useful 
to  carry  out  methodological  studies  that  concentrate 
on  a  single  component  of  error,  thereby  producing  re- 
sults that  are  highly  conditional. 

The need to measure the cost components associated
with a given measurement design, so that the findings
can be used in the TSD sense, requires emphasis and
reemphasis. The lack of detailed, readily available cost
data for specific alternative measurement designs is
viewed as a serious problem for the survey research
community. For example, the TSD concept can be used
very effectively to choose the appropriate length of
recall period for reporting utilization of health
services, provided methodological studies are carefully
designed to measure the variable error components as
well as the bias components for different reference
periods covering a sufficiently wide range of
alternatives. Clearly the costs associated with
collecting sufficient data for annual statistics, for
example, as well as the magnitude of the error
components, can contribute significantly to the ultimate
choice of recall period.

There was no clear consensus by the participants
concerning the utility of the suggested information
matrix for survey error components and cost
components. On the one hand, the value of such information
can only be assessed in a context in which the need
for better decisions among alternative survey
strategies, and the role of the TSD concept in meeting that
need, is widely recognized and accepted. On the other
hand, the very act of establishing an information
system for use by the survey research community could
substantially expedite the process of improving the
overall cost-effectiveness of survey designs.

The TSD session generated discussion concerning
the lack of adequate funds committed to research on
health survey methods. The growing need for more
detailed and accurate data on health care needs and
the utilization of health care facilities and services
emphasizes the survey research budgetary concerns of
those participating in the session.

Needed  Research 

1.  Implementation  of  research  priorities  in  other 
sections  of  this  report: 

a.  Necessitates  the  consideration  of  TSD  as  an 
integral  part  of  their  respective  protocols;  and 

b.  Is  essential  to  the  orderly  development  of 
TSD  theory. 

2.  Additional research is needed in the development
of methodologies for improved simulation and
modeling techniques to determine the
cost-effectiveness of various sampling design mixes and
measurement strategies.

GLOSSARY

Acquiescence— A tendency of the respondent to base
his reply on some stimulus other than the question
content. It may be stimulated by the desire to please
the interviewer, the agency collecting the data, or
some other cue such as the unbalanced question.

Agree-disagree— Form of question in which the
respondent responds by stating his or her concurrence or
non-concurrence with a statement.

Anonymous Replies— Survey information gathered in
which the respondent's identity cannot in any way
be linked to the information provided.

Bias (or Net Systematic Error) of a Survey Estimate—
The difference between the expected value (taken
over the sampling design and the distribution of
measurement errors) of the estimator and the "true"
value of the parameter being estimated. This is
particularly acute in surveys concerning sensitive or
confidential matters, and in which it might be expected
that the estimates are consistently below or above the
true population parameter. A consistent pattern of
under- or overreporting will result in bias.


Binomial  Variable—In  the  context  of  survey  research, 
a  response  to  a  question  to  which  only  two  choices  are 
possible.  The  respondent  is  instructed  to  pick  the 
one  which  better  describes  his  condition,  behavior,  or 
experience.  The  question  is  usually  referred  to  as 
being  on  a  dichotomous  nominal  scale. 

Bounded  Recall— An  interview  where  the  respondent 
is  reminded  of  what  he  reported  in  an  earlier  inter- 
view and  is  then  asked  only  to  report  on  any  new 
events  that  occurred  subsequent  to  the  bounding  in- 
terview. 

Contamination  Procedures— A  procedure  in  which  the 
respondent  is  instructed  to  perform  a  simple  proba- 
bility exercise,  and  for  a  given  outcome,  to  answer 
the  survey  question  correctly  or  incorrectly.  Given 
the  distribution  of  the  probability  exercise  and  the 
reported  rate  of  the  behavior,  the  true  rate  can  be 
estimated.  In  some  cases,  the  probability  exercise 
may  instruct  the  respondent  to  add  or  multiply  a 
random  variable  to  the  true  response  in  order  to 
protect  the  privacy  of  the  reply.  These  procedures 
are  designed  to  protect  the  survey  respondent's  pri- 
vacy. 

Coding Procedures— Techniques for providing unique
numerical designations to data such that quantitative
analysis of the data can be performed. These
techniques may be used for assigning labels to survey
respondents which, while allowing identification of data
as coming from a single source, protect the identity
of the person who is that source. The method can
also be used to conceal the true value of data,
especially that stored in computers, so that
interpretation of the coded data is impossible and meaningless
until the data are decoded.

Cost  Model— A  mathematical  formulation  of  the  costs 
which  would  be  incurred  through  use  of  a  given  sam- 
ple design.  The  preferred  survey  design  is  that  which 
minimizes  the  total  mean-square  error  of  estimate  for 
a  given  survey  cost  or  which  yields  minimum  cost 
given  a  specified  level  of  precision. 
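
The selection rule in this entry can be sketched in code. The following is a minimal illustration only: the design names, costs, and error values are hypothetical, the `Design` class and `best_design` function are not from the conference, and the interaction (covariance) term of the error model is assumed to be zero.

```python
from dataclasses import dataclass

@dataclass
class Design:
    name: str
    cost: float                     # total survey cost (hypothetical units)
    bias: float
    simple_response_var: float
    correlated_response_var: float
    sampling_var: float

    def mse(self) -> float:
        # Total mean square error = total variance + squared bias
        # (interaction term assumed to be zero in this sketch).
        variance = (self.simple_response_var
                    + self.correlated_response_var
                    + self.sampling_var)
        return variance + self.bias ** 2

def best_design(designs, budget):
    """Return the affordable design with the smallest MSE, or None."""
    affordable = [d for d in designs if d.cost <= budget]
    if not affordable:
        return None
    return min(affordable, key=Design.mse)

# Hypothetical "shopping list" of design alternatives.
designs = [
    Design("mail", 10.0, 0.30, 0.02, 0.00, 0.05),
    Design("telephone", 20.0, 0.20, 0.02, 0.01, 0.05),
    Design("personal interview", 40.0, 0.10, 0.01, 0.02, 0.05),
]

print(best_design(designs, budget=25.0).name)
```

The dual problem (minimum cost for a specified level of precision) would filter on `mse()` and minimize `cost` instead.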

Cryptic  Device— A  code  or  system  of  codes  which  con- 
ceals the  true  state  of  affairs.  It  may  be  used  to  conceal 
the  identity  of  survey  respondents. 

Cue— Some characteristic of the interview or the
interviewer, the question wording, or the interviewer's
behavior, including feedback, which influences the
direction of answers to one or more questions.

Diary— A  written  record  kept  concurrently  by  an  indi- 
vidual respondent  or  household  about  events  that 
would  usually  otherwise  be  difficult  to  remember. 

Dichotomous— A  random  variable  is  said  to  be  dichot- 
omous if  it  assumes  only  one  of  two  responses  or 
values. 

Distribution Function— A (cumulative) distribution
function (cdf) is the total frequency of members of a
variate with values less than or equal to some point, x.
The probability density function (pdf) provides the
probability of a value of x as a function of x. The pdf,
or frequency function, can be regarded as the
derivative of the distribution function.
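
In symbols, the relationship can be written as follows. This is a sketch for a continuous random variable; the notation X, F, and f is introduced here for illustration and does not appear in the glossary.

```latex
F(x) = P(X \le x), \qquad
f(x) = \frac{dF(x)}{dx}, \qquad
F(x) = \int_{-\infty}^{x} f(t)\,dt
```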

Error  Model— A  mathematical  relationship  which 
postulates  the  manner  in  which  both  sampling  and 
nonsampling  errors  arise  in  the  conduct  and  analysis 
of  a  sample  survey. 

The  measurement  error  model  developed  at  the 
Bureau  of  the  Census  postulates  that  each  survey  re- 
sponse is  a  realization  of  a  random  variable  possessing 
finite  second  moments.  Under  this  model  the  total 
variance  of  a  survey  estimate  may  be  divided  into 
several  components: 

Variance 

a.  The  simple  response  variance  contribution  to  the 
total  variance  arises  from  the  variability  of  each 
survey  response  about  its  own  expected  value.  In 
terms  of  a  simple  random  sampling  design,  the 
simple  response  variance  is  the  population  mean 
of  the  variances  of  each  population  unit. 

b.  The  correlated  response  variance  is  the  contribu- 
tion to  the  total  variance  arising  from  non-zero 
correlations  (in  the  sense  of  the  distribution  of 
measurement  errors)  between  the  responses  of 
sample  units. 

c.  The  response  variance  of  a  survey  estimator  is  the 
sum  of  the  simple  response  variance  and  the  cor- 
related response  variance. 

d.  The  sampling  variance  is  that  contribution  to  the 
total  variance  arising  from  the  random  selection 
of  a  sample,  rather  than  a  complete  enumera- 
tion, from  the  population. 

e.  The  interaction  contribution  to  the  total  variance 
of  estimate  is  that  component  arising  from  a  non- 
zero covariance  between  measurement  error  and 
sampling  error. 
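
To make components (a) through (c) concrete, the sketch below uses a commonly cited simplified form of the response variance of a sample mean under simple random sampling of n units: (sigma_d^2 / n)[1 + rho(n - 1)]. Here sigma_d^2 is the simple response variance and rho is an intraclass correlation among response deviations. The symbols, the function name, and the numbers are assumptions made for illustration; sampling variance and the interaction term are left out.

```python
def response_variance_of_mean(simple_response_var, rho, n):
    """Split the response variance of a sample mean into its two parts.

    simple_response_var: average variance of a response about its
                         own expected value (sigma_d^2)
    rho: assumed intraclass correlation among response deviations
    n:   number of sample units
    """
    simple = simple_response_var / n
    correlated = (n - 1) * rho * simple_response_var / n
    return simple, correlated, simple + correlated

# Even a small correlation can matter: with rho = 0.01 and n = 101
# the correlated part is as large as the simple part.
simple, correlated, total = response_variance_of_mean(1.0, 0.01, 101)
print(simple, correlated, total)
```

The example illustrates why the correlated component is singled out: it does not shrink with sample size in the way the simple response variance does.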

External  Validity  Criterion— When  an  independent 
source  of  information  exists  regarding  a  population 
being  surveyed,  then  the  individual  survey  responses 
may  be  checked  (vis-a-vis  the  independent  source) 
for  accuracy. 

Follow-up— A  procedure  whereby  those  members  of  a 
selected  sample  for  whom  a  response  is  not  obtained 
by  one  data  collection  strategy  (e.g.,  telephone  or 
mail)  are  contacted  by  the  same  or  another  data  col- 
lection strategy  in  order  to  increase  response  rate.  It 
can  also  be  used  to  designate  repeated  surveys  among 
a  panel  of  respondents. 

Ingratiating Behavior— Behavior on the part of
the respondent, the interviewer, or both, designed
primarily to please the other person.

Interviewer  Feedback—Some  verbal  or  non-verbal 
communication  by  the  interviewer  in  response  to  re- 
spondent behavior. 

Linkage— The process or technique for joining data
describing a single sampling unit, usually a person,
from one or more primary data sources. If
institutional records were being used to check information
provided by a survey respondent, such as a record
check, linkage would concern whether the
institutional records apply to the survey respondent, whether
the respondent and the records refer to identical
points in time and spatial locus, and whether the
respondent and institutional source define relevant
matters in congruent ways. In other cases, the
linkage might serve as supplemental information about
a respondent in order to facilitate correlations and
other analyses of association.

Matrix Sampling— A procedure to reduce the length
of complex questionnaires by asking any respondent
only a subset of all the questions of interest, rotating
the subset among the respondents.
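
One simple way to rotate question subsets is sketched below. The block labels and the function `assign_question_blocks` are hypothetical, and a cyclic rotation is only one of several possible assignment schemes.

```python
def assign_question_blocks(n_respondents, blocks, blocks_per_respondent):
    """Give each respondent a rotating subset of the question blocks."""
    k = len(blocks)
    return [
        [blocks[(r + j) % k] for j in range(blocks_per_respondent)]
        for r in range(n_respondents)
    ]

# Three blocks of questions, two blocks asked of each respondent.
for r, subset in enumerate(assign_question_blocks(4, ["A", "B", "C"], 2)):
    print(r, subset)
```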

Memory  Failure  or  Decay— The  universally  observed 
phenomenon  that  the  longer  ago  the  event  occurred 
in  the  past,  the  more  likely  the  respondent  is  to  have 
difficulty  recalling  the  event.  This  rule  may  not  hold 
true  where  the  event  is  associated  with  some  dramatic 
period  of  time  in  the  life  of  the  respondent. 

Monotonic— Referring  to  data  that  always  move  in  the 
same  direction  or  are  constant  with  reference  to  time 
or  another  variable.  The  data  never  move  in  the 
opposite  direction. 

Multiplicity  Estimators— An  unbiased  network  esti- 
mator that  weights  the  sample  elements  by  the  in- 
verses of  the  number  of  enumeration  units  at  which 
they  are  eligible  to  be  enumerated.  The  information 
needed  to  determine  the  weight  is  collected  in  the 
survey  from  the  enumeration  units  that  report  the 
elements. 
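
The weighting can be sketched as follows, assuming a simple random sample of enumeration units drawn from a known total number of units. The function name and the data are hypothetical. Each reported element carries its multiplicity, that is, the number of enumeration units eligible to report it.

```python
def multiplicity_estimate(reports, total_units):
    """Estimate the number of elements in the population.

    reports: one list per sampled enumeration unit; each entry is the
             multiplicity of an element reported by that unit
    total_units: number of enumeration units in the population
    """
    sampled_units = len(reports)
    # Weight every reported element by the inverse of its multiplicity.
    weighted_count = sum(1.0 / s for unit in reports for s in unit)
    return total_units / sampled_units * weighted_count

# Four sampled households out of 1,000; one reports nothing.
reports = [[1], [2, 2], [], [4]]
print(multiplicity_estimate(reports, total_units=1000))  # 562.5
```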

Network  Estimators— Estimators  which  adjust  for  the 
varying  probabilities  of  enumerating  elements  in  net- 
work surveys  by  appropriately  weighting  the  sample 
elements. 

Network  Survey— In  this  context  a  technique  of  esti- 
mating the  incidence  of  behaviors  which  are  both 
rare  and  sensitive.  The  respondent  is  instructed  to 
report  the  number  of  his  friends  who  have  committed 
the  behavior  under  consideration,  and  then  to  esti- 
mate the  number  of  friends  of  his  friends  who  have 
committed  the  behavior. 

Non-Response  Rate— The  complement  of  response 
rate.  The  numerator  is  those  eligible  respondents  se- 
lected in  a  sample  for  whom  information  is  not  ob- 
tained because  of  refusals,  not  found  at  home,  un- 
available by  reason  of  illness,  incompetence,  language 
difficulty,  etc.  The  denominator  is  the  total  number 
of  eligible  respondents  initially  selected  for  the 
sample. 

Overreporting— Survey  responses  which  produce  a 
higher  estimate  of  the  incidence  of  some  event  or 
characteristic  than  is  accurate. 

Panel— A study design involving re-interview or a
series of questionnaires with the same sample of
respondents (or household units) at two or more
different times. Usually used to study changes over time,
giving rise to longitudinal data.


Proxy  Respondents— Respondents  who  provide  infor- 
mation about  other  persons,  generally  within  the  same 
household,  in  addition  to  or  instead  of  providing 
information  about  themselves. 

Random  Digit  Dialing— A  procedure  for  obtaining  a 
probability  sample  of  households  with  telephones. 
Numbers  are  selected  at  random  from  exchanges 
without  prior  knowledge  of  whether  they  are  work- 
ing numbers,  business  numbers,  or  residential  house- 
hold numbers.  The  strength  of  the  procedure  is  the 
inclusion  of  those  households  with  unlisted  numbers. 
Caution  must  be  taken  to  assure  that  the  digits  used, 
whether  terminal  or  otherwise,  are  uniformly  distri- 
buted. 
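
A minimal sketch of the number-generation step is given below. The exchange prefixes are made up, the function name is hypothetical, and screening out non-working and business numbers is not shown.

```python
import random

def random_digit_numbers(exchanges, count, seed=None):
    """Draw telephone numbers with uniformly random 4-digit suffixes."""
    rng = random.Random(seed)
    numbers = []
    for _ in range(count):
        exchange = rng.choice(exchanges)    # area code and exchange
        suffix = rng.randrange(10_000)      # uniform on 0000..9999
        numbers.append(f"{exchange}-{suffix:04d}")
    return numbers

exchanges = ["313-764", "313-763"]          # hypothetical exchanges
print(random_digit_numbers(exchanges, count=5, seed=1))
```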

Randomized  Response— An  answer  to  a  question  or 
set  of  questions  randomly  selected  from  a  defined 
larger  universe  of  questions  in  a  survey.  The  techni- 
que is  particularly  well  suited  for  obtaining  sensitive 
information  and  assuring  the  respondent  of  anonym- 
ity. The  respondent  is  instructed  to  respond  to  the 
sensitive  question  after  a  simple  probability  exercise 
in  selecting  the  question.  Since  only  the  respondent 
knows  the  outcome  of  the  probability  exercise,  there 
is  no  possibility  that  his  or  her  response  can  be 
linked  with  certainty  to  his  or  her  relationship  with 
the  sensitive  issue.  Since  the  distribution  of  outcomes 
to  the  probability  exercise  is  known  in  advance,  it  is 
possible  to  estimate,  for  a  population,  the  rate  of  the 
behavior  or  experience  under  consideration  accord- 
ing to  the  given  probability  distribution  of  the  sam- 
pling exercise,  and  the  reported  response  rate. 
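
The estimation step can be illustrated with one well-known variant, Warner's design, which is assumed here and is not necessarily the variant the participants had in mind. With probability p the respondent answers the statement "I have the characteristic" and with probability 1 - p its negation, so the expected proportion of "yes" answers is p*pi + (1 - p)*(1 - pi), where pi is the true rate. The function name and numbers are illustrative.

```python
def warner_estimate(yes_count, n, p):
    """Estimate the true rate pi from randomized "yes" answers.

    yes_count: number of "yes" answers observed
    n: number of respondents
    p: known probability that the sensitive statement is selected
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the rate unidentifiable")
    yes_rate = yes_count / n
    # Solve yes_rate = p*pi + (1 - p)*(1 - pi) for pi.
    return (yes_rate + p - 1) / (2 * p - 1)

# 380 "yes" answers from 1,000 respondents with p = 0.7.
print(warner_estimate(380, 1000, 0.7))
```

In this example the estimate is approximately 0.2, although no individual answer reveals which statement was answered.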

Rapport— A broadly defined term used to refer to
the quality of the relationship or interaction between
the interviewer and respondent. Usually this refers
to characteristics of warmth and friendliness and
open communication in the interpersonal relationship.

Recall Period— The time period over which a
respondent is required to remember what events have
occurred. This period is characterized by the total
length of the time period and by the elapsed time from
the time of inquiry.

A          B          C

A -> B = REFERENCE PERIOD
A -> C = RECALL PERIOD
B -> C = LAG PERIOD; MAY VARY IN TIME FROM 0 TO ?

A < B <= C

Record Checks— The comparison of information
provided by a respondent in a survey with information
obtained from other sources, especially governmental
or institutional records including census, Social
Security, vital records, dispensaries, hospitals, mental
health agencies, pharmacies, and municipal
activities such as police and fire department functions.

Reliability— Correspondence, repeatability or
consistency between responses to identical survey
questions at two different times.


Respondent  Burden— The  level  of  demand  placed 
upon  the  respondent  necessary  to  answer  the  ques- 
tions in  the  survey  instrument.  This  includes  the 
total  time  demands  on  the  respondent,  the  demands 
on  his  memory,  difficulty  in  understanding  the  ques- 
tion and  possible  embarrassment. 

Response Rate— The percentage of an eligible sample
for whom information is obtained. For an interview
survey the numerator of the formula is the number
of interviews. The denominator is the total sample
size minus non-eligible respondents; that is, minus
those not meeting the criteria for a potential
respondent as defined for that particular study.
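
The formula, together with its complement (see Non-Response Rate), can be written as a short function. The function name and the counts are illustrative only.

```python
def response_rates(interviews, total_sample, ineligible):
    """Return (response rate, non-response rate) as percentages."""
    eligible = total_sample - ineligible
    if eligible <= 0:
        raise ValueError("no eligible respondents in the sample")
    response = 100 * interviews / eligible
    return response, 100 - response

# 1,100 selected, 100 found ineligible, 800 interviews completed.
print(response_rates(800, 1100, 100))  # (80.0, 20.0)
```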

Response  Set— A  tendency  to  respond  in  a  particular 
way  based  on  a  stimulus  other  than  the  content  of  the 
question. 

Social Desirability Bias— Answers which reflect an
attempt to enhance some socially desirable
characteristics or minimize the presence of some socially
undesirable characteristics. The source of the expectations
or values influencing answers can be the person himself
(ego-threatening), the perception of the interviewer,
or society as a whole; may give rise to an
acquiescent response.

Standardized Modules or Measures— Tested and
validated measures of major variables, such as those
dealing with illness and demographic characteristics. The
use of standard measures and measuring techniques
provides a basis for comparability of information from
investigator to investigator.

Stratification— A  design  technique  employed  in  sam- 
ple surveys  whereby  the  finite  population  is  classified 
into  several  parts  (or  strata)  and  a  random  sample 
is  independently  selected  from  each  stratum.  The  pur- 
pose of  stratification  is  to  reduce  the  sampling  var- 
iance. 
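
As an illustration, the sketch below computes the usual stratified estimate of a population mean and its estimated sampling variance, assuming simple random sampling without replacement within each stratum. The function name and the data are hypothetical.

```python
from statistics import mean, variance

def stratified_mean(strata):
    """Stratified estimate of the mean and its estimated variance.

    strata: list of (stratum population size, list of sample values)
    """
    total = sum(size for size, _ in strata)
    estimate = sum(size / total * mean(ys) for size, ys in strata)
    est_variance = sum(
        (size / total) ** 2          # squared stratum weight
        * (1 - len(ys) / size)       # finite population correction
        * variance(ys) / len(ys)     # within-stratum sample variance
        for size, ys in strata
    )
    return estimate, est_variance

strata = [(100, [1, 2, 3]), (300, [10, 12, 14])]
print(stratified_mean(strata))
```

Because only within-stratum variability enters the variance, strata that are internally homogeneous yield a smaller sampling variance than an unstratified sample of the same size.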

Telescoping— A  reporting  error  in  which  the  time  an 
event  occurred  is  remembered  as  having  been  more 
recent  than  it  actually  was.  Events  may  also  be  placed 
backward  in  time. 

Total Survey Design (TSD)— A concept that implies
an efficient allocation of survey resources among the
different error components in order to minimize the
total error of estimates.

Total Survey Error— The aggregate of all components
of error occurring in the conduct or analysis of a
sample survey. Included in the total survey error are
all sampling and nonsampling errors.

Total Mean Square Error— In a survey estimate this
is the expected value of the squared difference
between the estimator and the population parameter
being estimated, where the expectation is taken over
the sampling design and the distribution of
measurement errors.
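
Written out, the definition gives the standard decomposition into total variance plus squared bias, which ties this entry to the Bias and Error Model entries. The symbols are introduced here for illustration: theta-hat is the estimator and theta the population parameter.

```latex
\mathrm{MSE}(\hat{\theta})
  = E\big[(\hat{\theta} - \theta)^2\big]
  = \mathrm{Var}(\hat{\theta}) + \big[E(\hat{\theta}) - \theta\big]^2
```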

Unbalanced Format— Form of a question in which
only one alternative or choice is stated in the
question. A balanced format includes both alternatives
or all choices.

Underreporting— Survey responses that produce lower
estimates of the incidence of some event or
characteristic than is accurate.

Validity— A  valid  measure  is  one  that  measures  what 
it  claims  to  and  not  something  else.  Validity  is  a  con- 
tinuous concept  so  most  measures  fall  between  total 
validity  and  total  nonvalidity.  A  totally  valid  measure 
is  one  without  bias. 

Variance— See Error Model